Abstract

Let $A$ be a matrix whose columns $X_1, \dots, X_N$ are independent
random vectors in $\mathbb{R}^n$. Assume that the tails of the 1-dimensional
marginals decay as $\mathbb{P}(|\langle X_i, a\rangle| \ge t) \le t^{-p}$ uniformly
in $a \in S^{n-1}$ and $i \le N$. Then for $p > 4$ we prove that with high
probability $A/\sqrt{n}$ has the Restricted Isometry Property (RIP) provided
that the Euclidean norms $|X_i|$ are concentrated around $\sqrt{n}$. We also
show that the covariance matrix is well approximated by the empirical covariance
matrix and establish corresponding quantitative estimates on the rate
of convergence in terms of the ratio $n/N$. Moreover, we obtain sharp
bounds for both problems when the decay is of the type $\exp(-t^\alpha)$ with
$\alpha \in (0, 2]$, extending the known case $\alpha \in [1, 2]$.

1 Introduction and main results


Fix positive integers $n$, $N$ and let $A$ be an $n \times N$ random matrix whose
columns $X_1, \dots, X_N$ are independent random vectors in $\mathbb{R}^n$. For a subset
$I \subset \{1, \dots, N\}$ of cardinality $m$, denote by $A^I$ the $n \times m$ matrix whose
columns are $X_i$, $i \in I$. We are interested in estimating the interval of
fluctuation of the spectrum of some matrices related to $A$ when the random
vectors $X_i$, $i \le N$, have heavy tails; firstly, uniform estimates of the
spectrum of $(A^I)^\top A^I$, which is the set of squares of the singular values of $A^I$,
where $I$ runs over all subsets of cardinality $m$ for some fixed parameter $m$,
and secondly estimates for the spectrum of $AA^\top$. The first problem is related
to the notion of Restricted Isometry Property (RIP) with $m$ a parameter of
sparsity whereas the second is about approximation of a covariance matrix
by empirical covariance matrices.

These questions have been substantially developed over recent years and
many papers devoted to these notions were written. In this work, we say that
a random vector $X$ in $\mathbb{R}^n$ satisfies the tail behavior $H(\phi)$ with parameter
$\tau \ge 1$, if

$$H(\phi): \quad \forall a \in S^{n-1} \ \ \forall t > 0 \qquad \mathbb{P}(|\langle X, a\rangle| \ge t) \le \tau/\phi(t) \qquad (1)$$

for a certain function $\phi$ and we assume that $X_i$ satisfies $H(\phi)$ for all $i \le N$.
We will focus on two choices of the function $\phi$, namely $\phi(t) = t^p$, with $p > 4$,
which means heavy tail behavior for marginals, and $\phi(t) = (1/2)\exp(t^\alpha)$,
with $\alpha \in (0, 2]$, which corresponds to an exponential power type tail behavior
and extends the known subexponential case (a = 1, see BED- 

The concept of the Restricted Isometry Property was introduced in [10]
in order to study an exact reconstruction problem by the $\ell_1$-minimization
algorithm, classical in compressed sensing. Although it provided only a sufficient
condition for the reconstruction, it played a decisive role in the development
of the theory, and it is still an important property. This is mostly due to
the fact that a large number of important classes of random matrices have
RIP. It is also noteworthy that the problem of reconstruction can be
reformulated in terms of convex geometry, namely in terms of neighborliness of
the symmetric convex hull of $X_1, \dots, X_N$, as was shown in [12].

Let us recall the intuition of RIP (for the definition see (9) below). For
an $n \times N$ matrix $T$ and $1 \le m \le N$, the isometry constant of order $m$ of $T$ is
the parameter $0 \le \delta_m(T) < 1$ such that the squares of the Euclidean norms $|Tz|$
and $|z|$ are approximately equal, up to a factor $1 + \delta_m(T)$, for all $m$-sparse
vectors $z \in \mathbb{R}^N$ (that is, $|\mathrm{supp}(z)| \le m$). Equivalently, this means that for
every $I \subset \{1, \dots, N\}$ with $|I| \le m$, the spectrum of $(T^I)^\top T^I$ is contained
in the interval $[1 - \delta_m(T), 1 + \delta_m(T)]$. In particular when $\delta_m(T) < \theta$ for
small $\theta$, then the squares of singular values of the matrices $T^I$ belong to
$[1 - \theta, 1 + \theta]$. Note that in compressed sensing for the reconstruction of vectors
by $\ell_1$-minimization, one does not need RIP for all $\theta > 0$ (see [12] and mi).
The RIP contains implicitly a normalization, in particular it implies that the
Euclidean norms of the columns belong to an interval centered around one.

Let $A$ be an $n \times N$ random matrix whose columns are $X_1, \dots, X_N$. In
view of the example of matrices with i.i.d. entries, centered and with variance
one, for which $\mathbb{E}|X_i|^2 = n$, we normalize the matrix by considering $A/\sqrt{n}$
and we introduce the concentration function

$$P(\theta) := \mathbb{P}\left( \max_{i \le N} \left| \frac{|X_i|^2}{n} - 1 \right| \ge \theta \right). \qquad (2)$$


Until now the only known cases of random matrices satisfying a RIP were
the cases of subgaussian [91 EHl [121 123] and subexponential [1] matrices. Our
first main theorem says that matrices we consider have the RIP of order
$m$, with “large” $m$ of the form $m = n\psi(n/N)$ with $\psi$ depending on $\phi$ and
possibly on other parameters. In particular, when $N$ is proportional to $n$,
then $m$ is proportional to $n$. We present a simplified version of our result,
for the detailed version see Theorem 3.1 below.


Theorem 1.1 Let $0 < \theta < 1$. Let $A$ be a random $n \times N$ matrix whose
columns $X_1, \dots, X_N$ are independent random vectors satisfying hypothesis
$H(\phi)$ for some $\phi$. Assume that $n$, $N$ are large enough. Then there exists a
function $\psi$ depending on $\phi$ and $\theta$ such that with high probability (depending
on the concentration function $P(\theta)$) the matrix $A/\sqrt{n}$ has RIP of order
$m = [n\psi(n/N)]$ with a parameter $\theta$ (that is, $\delta_m(A/\sqrt{n}) \le \theta$).

The second problem we investigate goes back to a question of Kannan,
Lovász and Simonovits (KLS). As before assume that $A$ is a random $n \times N$
matrix whose columns $X_1, \dots, X_N$ are independent random vectors satisfying
hypothesis $H(\phi)$ for some $\phi$. Additionally assume that the $X_i$'s are identically
distributed as a centered random vector $X$. The KLS question asks how fast
the empirical covariance matrix $U := (1/N)AA^\top$ converges to the covariance
matrix $\Sigma := (1/N)\mathbb{E}AA^\top = \mathbb{E}U$. Of course this depends on assumptions
on $X$. In particular, is it true that with high probability the operator norm
$\|U - \Sigma\| \le \varepsilon\|\Sigma\|$ for $N$ being proportional to $n$? Originally this was asked for
log-concave random vectors but the general question of approximating the
covariance matrix by sample covariance matrices is an important subject in
Statistics as well as in its own right. The corresponding question in Random
Matrix Theory is about the limit behavior of smallest and largest singular
values. In the case of Wishart matrices, that is when the coordinates of $X$
are i.i.d. centered random variables of variance one, the Bai-Yin theorem |B]
states that under assumption of boundedness of fourth moments the limits of
minimal and maximal singular numbers of $U$ are $(1 \pm \sqrt{\beta})^2$ as $n, N \to \infty$ and
$n/N \to \beta \in (0,1)$. Moreover, it is known [TJ |30] that boundedness of fourth
moment is necessary in order to have the convergence of the largest singular
value. The asymptotic non-limit behavior (also called “non-asymptotic” in
Statistics), i.e., sharp upper and lower bounds for singular values in terms of
$n$ and $N$, when $n$ and $N$ are sufficiently large, was studied in several works.
To keep the notation more compact and clear we put


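$$M := \max_{i \le N} |X_i|, \qquad S := \sup_{a \in S^{n-1}} \left| \frac{1}{N} \sum_{i=1}^{N} \left( \langle X_i, a\rangle^2 - \mathbb{E}\langle X_i, a\rangle^2 \right) \right|. \qquad (3)$$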


Note that if $\mathbb{E}\langle X, a\rangle^2 = 1$ for every $a \in S^{n-1}$ (that is, $X$ is isotropic), then
the bound $S \le \varepsilon$ is equivalent to the fact that the singular values of $U$ belong
to the interval $[1 - \varepsilon, 1 + \varepsilon]$. For Gaussian matrices it is known (|13[ l3^ 1) that
with probability close to one

$$S \le C\sqrt{n/N}, \qquad (4)$$

where $C$ is a positive absolute constant. In HE] the same estimate was
obtained for a large class of random matrices, which in particular did not
require that entries of the columns are independent, or that the $X_i$'s are
identically distributed. In particular this solved the original KLS problem. More
precisely, (4) holds with high probability under the assumptions that the $X_i$'s
satisfy hypothesis $H(\phi)$ with $\phi(t) = e^t/2$ and that $M \le C(Nn)^{1/4}$ with high
probability. Both conditions hold for log-concave random vectors.

Until recent time, quite strong conditions on the tail behavior of the one
dimensional marginals of the $X_i$'s were imposed, typically of subexponential
type. Of course, in view of the Bai-Yin theorem, it is a natural question whether
one can replace the function $\phi(t) = e^t/2$ by the function $\phi(t) = e^{t^\alpha}/2$ with
$\alpha \in (0,1)$ or $\phi(t) = t^p$, for $p \ge 4$. The first attempt in this direction was done
in [M], where the bound $S \le C(p, K)(n/N)^{1/2 - 2/p}(\ln\ln n)^2$ was obtained for
every $p > 4$ provided that $M \le K\sqrt{n}$. Clearly, $\ln\ln n$ is a “parasitic”
term, which, in particular, does not allow to solve the KLS problem with
$N$ proportional to $n$. This problem was solved in [2ll [31] under strong
assumptions and in particular when $M \le K\sqrt{n}$ and $X$ has i.i.d. coordinates
with bounded $p$-th moment with $p > 4$. Very recently, in [25], the “right”
upper bound $S \le C(n/N)^{1/2}$ was proved for $p > 8$ provided that
$M \le C(Nn)^{1/4}$. The methods used in [25] play an influential role in the present
paper.

The problems of estimating the smallest and the largest singular values
are quite different. One expects weaker assumptions for estimating the
smallest singular value. This already appeared in the work [21] and was pushed
further in recent works [T9ll22ll35] and in [Hi[201126] which led to new bounds
on the performance of $\ell_1$-minimization methods.

In this paper we solve the KLS problem for $4 < p \le 8$, in Theorem 1.2.
Our argument works also in other cases and makes the bridge between the
known cases $p > 8$ and the exponential case.


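Theorem 1.2 Let $X_1, \dots, X_N$ be independent random vectors in $\mathbb{R}^n$
satisfying hypothesis $H(\phi)$ with $\phi(t) = t^p$ for some $p \in (4,8]$. Let $\varepsilon \in (0,1)$ and
$\gamma = p - 4 - 2\varepsilon > 0$. Then

$$S \le C \left( \frac{1}{N} \max_{i \le N} |X_i|^2 + C(p, \varepsilon) \left( \frac{n}{N} \right)^{\gamma/p} \right) \qquad (5)$$

with probability larger than $1 - 8e^{-n} - 2\varepsilon^{-p/2} \max\{N^{-3/2}, n^{-(p/4 - 1)}\}$.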


In particular, if $N$ is proportional to $n$ and $M^2/n$ is bounded by a constant
with high probability, which is the case for large classes of random vectors,
then with high probability

$$S \le C.$$


Let $X$ have i.i.d. coordinates distributed as a centered random variable
with finite $p$-th moment, $p \ge 2$. Then by Rosenthal's inequality ([29], see
also m and Lemma 6.3 below), $X$ satisfies hypothesis $H(\phi)$ with $\phi(t) = t^p$.
Let $X_1, \dots, X_N$ be independent random vectors distributed as $X$. It is known
(0. n. see also [22] for a quantitative version) that when $N$ is proportional
to $n$ and in the absence of fourth moment, $M^2/n \to \infty$ as $n \to \infty$. Hence,
bounds for $S$ involving the term $M^2/n$ like the bound (5) are of interest only
for $p \ge 4$. We don't know if it holds in the case $p = 4$.

The main novelty of our proof is a delicate analysis of the behavior of
norms of submatrices, namely quantities $A_k$ and $B_k$, $k \le N$, defined in (6)
below. This analysis is done in Theorem 2.1, which is in the heart of the
technical part of the paper and it will be presented in the next section. The
estimates for $B_k$ are responsible for RIP, Theorem 1.1, while the estimates
for $A_k$ are responsible for the KLS problem, Theorem 1.2.

As usual in this paper $C, C_0, C_1, \dots, c, c_0, c_1, \dots$ always denote absolute
positive constants whose values may vary from line to line.

The paper is organized as follows. In Section 2, we formulate the main
technical result. For the reader's convenience, we postpone its proof till
Section 5. In Section 3 we discuss the results on RIP. The fully detailed
formulation of the main result in this direction is Theorem 3.1, while Theorem 1.1
is its very simplified corollary. In Section 4 we prove Theorem 1.2 as a
consequence of Theorem 4.5. The case $p > 8$ and the exponential cases are proved
in Theorem 4.7 using the same argument. Symmetrization and formulas for
sums of the $k$ smallest order statistics of independent non-negative random
variables with heavy tails allow to reduce the problem on hand to estimates
for $A_k$. In the last Section 6 we discuss optimality of the results.

An earlier version of the main results of this paper was announced in [T^ .

Acknowledgment. A part of this research was performed while the authors
were visiting at several universities. Namely, the first named author visited
University of Alberta at Edmonton in April 2013 and the second and the
fourth named author visited University Paris-Est in June 2013 and in June
2014. The authors would like to thank these universities for their support
and hospitality.


2 Norms of submatrices 

We start with a few general preliminaries and notations. We denote by $B_2^n$
and $S^{n-1}$ the standard unit Euclidean ball and the unit sphere in $\mathbb{R}^n$ and by
$|\cdot|$ and $\langle \cdot, \cdot \rangle$ the corresponding Euclidean norm and inner product. Given
a set $E \subset \{1, \dots, N\}$, $|E|$ denotes its cardinality and $B_2^E$ denotes the unit
Euclidean ball in $\mathbb{R}^E$, with the convention $\mathbb{R}^\emptyset = \{0\}$.
A standard volume argument implies that for every integer $n$ and for every
$\varepsilon \in (0,1)$ there exists an $\varepsilon$-net $\Lambda \subset B_2^n$ of $B_2^n$ of cardinality not exceeding
$(1 + 2/\varepsilon)^n$; that is, for every $x \in B_2^n$, $\min_{y \in \Lambda} |x - y| \le \varepsilon$. In particular, if
$\varepsilon \le 1/2$ then the cardinality of $\Lambda$ is not larger than $(2.5/\varepsilon)^n$.
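
The passage from $(1 + 2/\varepsilon)^n$ to $(2.5/\varepsilon)^n$ only uses that $1 + 2/\varepsilon \le 2.5/\varepsilon$ exactly when $\varepsilon \le 1/2$. A minimal numerical sanity check is sketched below; the function names are ours, chosen for illustration, and are not notation from the paper.

```python
def net_cardinality_bound(n: int, eps: float) -> float:
    """Volumetric bound (1 + 2/eps)^n on the size of an eps-net of the unit ball."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    return (1.0 + 2.0 / eps) ** n


def simplified_bound(n: int, eps: float) -> float:
    """Simplified bound (2.5/eps)^n, intended for eps <= 1/2."""
    return (2.5 / eps) ** n


# 1 + 2/eps <= 2.5/eps is equivalent to eps <= 1/2, so the simplified
# bound dominates the volumetric one on that whole range.
for eps in (0.5, 0.25, 0.1):
    for n in (1, 5, 20):
        assert net_cardinality_bound(n, eps) <= simplified_bound(n, eps)
```

For $\varepsilon = 1/4$, the value used later for the $(1/4)$-net, the volumetric bound is $9^n$.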

By $\mathcal{M}$ we denote the class of increasing functions $\phi : [0, \infty) \to [0, \infty)$
such that the function $\ln \phi(1/\sqrt{x})$ is convex on $(0, \infty)$. The examples of
such functions considered in this paper are $\phi(x) = x^p$ for some $p > 0$ and
$\phi(x) = (1/2)\exp(x^\alpha)$ for some $\alpha > 0$.

Recall that the hypothesis $H(\phi)$ has been defined in the introduction by (1).
Note that this hypothesis is satisfied if

$$\sup_{a \in S^{n-1}} \mathbb{E}\,\phi(|\langle X, a\rangle|) \le \tau.$$
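
The sufficient condition above is Markov's inequality applied to $\phi(|\langle X, a\rangle|)$. The following sketch checks it exactly for one fixed direction, with a small discrete random variable $Y$ standing in for the marginal $\langle X, a\rangle$; the distribution, the exponent $p = 5$ and all function names are illustrative assumptions, not data from the paper.

```python
def phi_power(t: float, p: float) -> float:
    """The heavy-tail choice phi(t) = t^p."""
    return t ** p


def tail_probability(values, probs, t):
    """Exact P(|Y| >= t) for a discrete random variable Y."""
    return sum(q for v, q in zip(values, probs) if abs(v) >= t)


def phi_moment(values, probs, phi):
    """Exact E phi(|Y|) for a discrete random variable Y."""
    return sum(q * phi(abs(v)) for v, q in zip(values, probs))


# Hypothetical marginal distribution (symmetric, five atoms).
values = [-2.0, -1.0, 0.0, 1.0, 2.0]
probs = [0.05, 0.2, 0.5, 0.2, 0.05]
p = 5.0

# Take tau to be the phi-moment itself; Markov then gives the tail bound.
tau = phi_moment(values, probs, lambda t: phi_power(t, p))


def check_tail_bound(ts):
    """True if P(|Y| >= t) <= tau / phi(t) holds for every t in ts."""
    return all(
        tail_probability(values, probs, t) <= tau / phi_power(t, p) + 1e-12
        for t in ts
    )


assert check_tail_bound([0.5, 1.0, 1.5, 2.0, 3.0])
```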

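For $k \le N$ and random vectors $X_1, \dots, X_N$ in $\mathbb{R}^n$ we define $A_k$ and $B_k$ by

$$A_k := \sup_{\substack{a \in S^{N-1} \\ |\mathrm{supp}(a)| \le k}} \left| \sum_{i=1}^{N} a_i X_i \right|, \qquad B_k^2 := \sup_{\substack{a \in S^{N-1} \\ |\mathrm{supp}(a)| \le k}} \left| \, \left| \sum_{i=1}^{N} a_i X_i \right|^2 - \sum_{i=1}^{N} a_i^2 |X_i|^2 \right|. \qquad (6)$$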


We would like to note that $A_k$ is the supremum of norms of submatrices
consisting of $k$ columns of $A$, while $B_k$ plays a crucial role for RIP estimates.

We provide more details on the role of $A_k$ and $B_k$ in the next section.
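
To make the two quantities concrete, here is a small sketch for $k = 2$, under the assumption that $A_k$ is the supremum of $|\sum_i a_i X_i|$ over $k$-sparse unit vectors $a$ and that $B_k^2$ is the supremum of $\big| |\sum_i a_i X_i|^2 - \sum_i a_i^2 |X_i|^2 \big|$ over the same set. For a pair of columns with Gram entries $g_{xx}, g_{yy}, g_{xy}$, the first supremum is the largest eigenvalue of the $2 \times 2$ Gram matrix, and the second equals $|g_{xy}|$ because the expression reduces to $2 a_1 a_2 g_{xy}$. The function name and the sample columns are ours.

```python
import math
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(u, v))


def a2_and_b2_squared(columns):
    """Return (A_2, B_2^2) for a list of column vectors by scanning all pairs."""
    # Supports of size one contribute |X_i|^2 to A_2^2 and nothing to B_2^2.
    best_lambda = max(dot(x, x) for x in columns)
    best_cross = 0.0
    for x, y in combinations(columns, 2):
        gxx, gyy, gxy = dot(x, x), dot(y, y), dot(x, y)
        # Largest eigenvalue of the Gram matrix [[gxx, gxy], [gxy, gyy]].
        lam = (gxx + gyy) / 2 + math.hypot((gxx - gyy) / 2, gxy)
        best_lambda = max(best_lambda, lam)
        # |2 a_1 a_2 gxy| is maximized at a_1 = a_2 = 1/sqrt(2), giving |gxy|.
        best_cross = max(best_cross, abs(gxy))
    return math.sqrt(best_lambda), best_cross


# Hypothetical columns in R^2: two basis vectors and their sum.
cols = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
A2, B2sq = a2_and_b2_squared(cols)
```

In this toy example the off-diagonal Gram entries drive $B_2$, while $A_2$ also sees the column norms, which is one way to read the remark that the two quantities are of different nature.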

Recall also a notation from the introduction

$$M = \max_{i \le N} |X_i|.$$

We formulate now the main technical result, Theorem 2.1, which is the
key result for both bounds for $A_k$ and for $B_k$. The role of $A_k$ and $B_k$ in RIP
estimates will be explained in the next section. We postpone the proof to
Section 5.

Theorem 2.1 Let $p > 4$, $\sigma \in (2, p/2)$, $\alpha \in (0,2]$, $t > 0$, and $\tau, \lambda \ge 1$. Let
$X_1, \dots, X_N$ be independent random vectors in $\mathbb{R}^n$ satisfying hypothesis $H(\phi)$
with parameter $\tau$ either for $\phi(x) = x^p$ or for $\phi(x) = (1/2)\exp(x^\alpha)$. For
$k \le N$ define $M_1$, $\beta$ and $C_\phi$ in two following cases.

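Case 1. $\phi(x) = x^p$. We assume that $\lambda \le p$ and we let $C_\phi = e^4$,

$$M_1 := C_1(\sigma, \lambda, p)\, \sqrt{k} \left( \frac{N\tau}{k} \right)^{\sigma/p} \quad \text{and} \quad \beta := C_2(\sigma, \lambda)(\tau N)^{-\lambda} + C_3(\sigma, \lambda, p)\, \frac{N^2 \tau}{t^p},$$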
where


Ci{a,X,p) 


32 e" 


a + A 
1 + A/2 


/ 2p 

\p-2a) 


a + Ay/^ 
a-2) 


{20ey/P, 


C'2(ct,A) 


|^ 2(a + A) y 1 

\5e{a-2)J 2A - 1 


and C 3 {a,X,p) 


(a + A)P 
4(2(a-2))p' 


Case 2. (f){x) = (1/2) exp(a;"). We assume that A > 2 and we let = 

where C is an absolute positive constant, 


Ml 


(CA)^/“ \fk 




1/a. 


and 

1 / A/c"/^ A N'^t 

(IONt)^ ^ \ (3.5 \n{2k)y°‘) 2exp((2t)“) 

In both cases we also assume that $\beta < 1/32$. Then with probability at least
$1 - \sqrt{\beta}$ one has


Ak<{l- (m + 2^/C/jM + Ml) 


and 

Bl<{l- 4^)-2 + {^C^t + Ml) M + 2M/) . 


We would like to emphasize that $A_k$ and $B_k$ are of different nature. In
particular, Theorem 2.1 in the case $\phi(x) = x^p$ has to be applied with different
choices of the parameter $\sigma$. We summarize those choices in the following
remark.

Remark. In the case $\phi(x) = x^p$ we will use the following two choices for $\sigma$:
1. Choosing $\sigma = p/4$ and assuming $p > 8$ we get


Ml 




1/4 


/ 2(p + 4A) y 1 NW{p + AX)P 
\5eNT{p -8)) 2A-1 ^ A(2t(p - 8 ))p ' 


and 





















2. Choosing $\sigma = 2 + \varepsilon$ with $\varepsilon \le \min\{1, (p - 4)/4\}$, we get


and 


Mi<C 



l+(4+2e)/p 



{2+s)/p 


/2(3 + A)y 1 iVV(3 + A)P 
\ deeNr ) 2A — 1 4{2et)P 


Remarks on optimality.

1. The case $\phi(x) = x^p$, $p > 4$. Let $\tau \ge 1$, $N \ge (64\, C_2(\sigma, \lambda))^{1/\lambda}$ and
$t = (64 N^2 \tau\, C_3(\sigma, \lambda, p))^{1/p}$. Then $\beta \le 1/32$ and $\sqrt{tM} \le C_4(\sigma, \lambda, p)(M + M_1)$.
Hence with probability larger than 3/4 we have

$$A_k \le C(\sigma, \lambda, p) \left( M + \sqrt{k}\, (N\tau/k)^{\sigma/p} \right).$$

In Proposition 6.5 below we show that there exist independent random
vectors $X_i$'s satisfying the conditions of Theorem 2.1 with $\tau = 1$ and such that

Ak > C{p)Vk{N/ky/P {\n{2N/k)y^^^ 


with probability at least 1/2. Note that M = Ai < Ak- Therefore 

Ya>ix{M,C{p)Vk{N/ky>’’{\Yi{2N/k))~'''‘} < Ay < C(a,A,p) [m + xThiN/kfl^ 

(7) 

with probability at least 1/4. 

2. The case (pix) = (1/2) exp(x“), a G [1,2]. Let A = 2 and t = (IniV)^/". 
Then (3 < 1/32. Hence with probability larger than 3/4 we have 

Ak<c[M + C^/'^y/k (ln(6i/“rAr/fc)) . 

In Proposition 16.71 below we show that there exist independent random vec¬ 
tors X/s satisfying the conditions of Theorem 12.11 with r bounded by an 
absolute constant and such that 

Ak>y/^{HN/{k + l))f/^ 


with probability at least 1/2. Using again that M = Ai < Ak we observe 


max{M, y/k/2 {\n{N/{k + 1)))^^“} < Ak < C (^M + (ln(6^/"iV/A;))^^") 

( 8 ) 


with probability at least 1/4. 
3 Restricted Isometry Property


We need more definitions and notations. 

Let $T$ be an $n \times N$ matrix and let $1 \le m \le N$. The $m$-th isometry
constant of $T$ is defined as the smallest number $\delta_m = \delta_m(T)$ so that

$$(1 - \delta_m)|z|^2 \le |Tz|^2 \le (1 + \delta_m)|z|^2 \qquad (9)$$

holds for all vectors $z \in \mathbb{R}^N$ with $|\mathrm{supp}(z)| \le m$. For $m = 0$, we put $\delta_0(T) = 0$.
Let $\delta \in (0,1)$. The matrix $T$ is said to satisfy the Restricted Isometry
Property of order $m$ with parameter $\delta$, in short $\mathrm{RIP}_m(\delta)$, if $0 \le \delta_m(T) \le \delta$.
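
For very small $m$ the isometry constant can be computed by brute force, which may help in reading the definition. The sketch below treats only $m = 2$: for each pair of columns the extreme values of $|Tz|^2$ over unit vectors supported on that pair are the two eigenvalues of the $2 \times 2$ Gram matrix. The function name and the sample matrices are illustrative assumptions, and the approach does not scale to the regime of $m$ proportional to $n$ studied in the paper.

```python
import math
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(u, v))


def isometry_constant_order_2(columns):
    """Smallest delta with (1-delta)|z|^2 <= |Tz|^2 <= (1+delta)|z|^2 for 2-sparse z."""
    # Supports of size one: |Tz|^2 / |z|^2 equals the squared column norm.
    delta = max(abs(dot(x, x) - 1.0) for x in columns)
    for x, y in combinations(columns, 2):
        gxx, gyy, gxy = dot(x, x), dot(y, y), dot(x, y)
        mid = (gxx + gyy) / 2
        rad = math.hypot((gxx - gyy) / 2, gxy)
        # Eigenvalues of the Gram matrix are mid + rad and mid - rad.
        delta = max(delta, abs(mid + rad - 1.0), abs(mid - rad - 1.0))
    return delta


# Orthonormal columns are a perfect isometry on sparse vectors.
delta_orthonormal = isometry_constant_order_2([(1.0, 0.0), (0.0, 1.0)])

# Adding a unit column correlated with both basis vectors degrades it.
s = 2 ** -0.5
delta_correlated = isometry_constant_order_2([(1.0, 0.0), (0.0, 1.0), (s, s)])
```

In the second example all columns have norm one, so the whole constant comes from the inner product between columns.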

Recall that a vector $z \in \mathbb{R}^N$ is called $m$-sparse if $|\mathrm{supp}(z)| \le m$. The
subset of $m$-sparse unit vectors in $\mathbb{R}^N$ is denoted by

$$U_m = U_m(\mathbb{R}^N) := \{z \in \mathbb{R}^N : |z| = 1, \ |\mathrm{supp}(z)| \le m\}.$$


Let $X_1, \dots, X_N$ be random vectors in $\mathbb{R}^n$ and let $A$ be the $n \times N$ matrix
whose columns are the $X_i$'s. By the definition of $B_m$ (see (6)) we clearly have

$$\max_{i \le N} \left| \frac{|X_i|^2}{n} - 1 \right| \le \delta_m(A/\sqrt{n}) \le \frac{B_m^2}{n} + \max_{i \le N} \left| \frac{|X_i|^2}{n} - 1 \right|. \qquad (10)$$

Thus, in order to have a good bound on $\delta_m(A/\sqrt{n})$ we require a strong
concentration of each $|X_i|$ around $\sqrt{n}$ and we need to estimate $B_m$.

To control the concentration of $|X_i|$ we consider the function $P(\theta)$, defined
in the introduction by (2). Note that this function estimates the
concentration of the maximum. Therefore, when it is small, we have much better
concentration of each $|X_i|$ around $\sqrt{n}$.

We are now ready to state the main result about RIP. Theorem 1.1,
announced in the introduction, is a very simplified form of it.


Theorem 3.1 Let $p > 4$, $\alpha \in (0, 2]$, $\tau \ge 1$ and $1 \le n \le N$. Let $X_1, \dots, X_N$
be independent random vectors in $\mathbb{R}^n$ satisfying hypothesis $H(\phi)$ with the
parameter $\tau$ either for $\phi(x) = x^p$ or for $\phi(x) = (1/2)\exp(x^\alpha)$. Let $P(\cdot)$ be as
in (2) and $\theta \in (0,1)$.

Case 1. $\phi(x) = x^p$. Let $\varepsilon \le \min\{1, (p - 4)/4\}$. Assume that

—— < N < c6{ce6Y^‘^ 
and set


m = 


yy—X —2(2+£)/(p—4—2£) 

C{e,e,p)n ( — j 


and fd = 


5P ivV 


4(2c£ 6')p?t.p/^ ’ 


where 

C{0,e,p) = c 


p — A 

p 


2(p+4+2e)/(p-4-2e) 


^4(2+£)/(p-4-2e) ^2p/(p-4-2£)^ 


c and C are absolute positive constants. 
Case 2. (j){x) = (1/2) exp(x"). Assume that 


1 

— max 

T 


and set 

and 


/9 = 


{2i/“,4/d} < iV < cd^/^exp ((1/2) (cd^/n)") 
n (ln(C2/“ Arr/(d2 n))) 

-2m"/2 


m = 


(lOiVr) 


■ exp 


N^t 


, , , iH- exp (—c (6\/n)°') , 

(3.51n(2m))2“/ 2 ^ ^ V v ; ; , 


where $c$ and $C$ are absolute positive constants.
Then in both cases we have

$$\mathbb{P}\left( \delta_m(A/\sqrt{n}) \le \theta \right) \ge 1 - \sqrt{\beta} - P(\theta/2).$$

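Remarks. 1. Note that for instance in Case 1, the constraint $N \le c(\theta, \varepsilon, \tau, p)\, n^{p/4}$
is not important because for $N \gg n^{p/4}$ one has

$$m = \left[ C(\theta, \varepsilon, p)\, n \left( \frac{N\tau}{n} \right)^{-2(2+\varepsilon)/(p-4-2\varepsilon)} \right] = 0.$$

A similar remark is valid in the second case.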

2. In most applications $P(\theta) \to 0$ very fast as $n, N \to \infty$. For example,
for so-called isotropic log-concave random vectors it follows from results of
Paouris ([23 [28], see also [HI |T6] or Lemma 3.3 of [1]). As another
example consider the model when the $X_i$'s are i.i.d. and moreover the coordinates of
$X_i$ are i.i.d. random variables distributed as a random variable $\xi$. In the
case when $\xi$ is of variance one and has finite $p$-th moment, $p > 4$, then by
Rosenthal's inequality $P(\theta)$ is well bounded (for a precise bound see
Corollary [63] below, see also Proposition 1.3 of |3T]). Another case is when $\xi$ is
the Weibull random variable of variance one, that is consider $\xi_0$ such that
$\mathbb{P}(|\xi_0| > t) = \exp(-t^\alpha)$ for $\alpha \in (0,2]$ and let $\xi = \xi_0/\sqrt{\mathbb{E}\xi_0^2}$. By Lemma 3.4
from [1] (see also Theorem 1.2.8 in [II]), $P(\theta)$ satisfies fiTT]) below.

3. Optimality. Taking £ in the Case 1 of order (p — 4)^/ln(2A^r/n) and 
assuming that it satishes the condition of the theorem, we observe that in 
Case 1 


m = 


C{9,p) n 


^^y4/(p-4) 



-8/(p-4) 


Moreover, Proposition 16.61 below shows that for g > p > 4 there are in¬ 
dependent random vectors Xj’s satisfying hypothesis H(0) with parameter 
r = r(p, g) and such that for 9 = 1/2, N < C{p,q)rpP {\n{2N one 
can’t get better estimate than 


m<8{N/n) n. 


4. Optimality. In Case 2 with r bounded by an absolute constant and a G 
[1, 2], let Con <N<C\ exp(c 2 n“/^), 9 = 0.4 and assume that P{9/2) is small 
enough. Then P((5m < 1/2) > 1/2 provided that m = n (Cln(C^/“A^/n))^^°'. 
Proposition 16.71 below shows that the estimate for m is sharp, that is in 
general m can’t be larger than m = n {C \n{2N. 

Proof. We first pass to the subset $\Omega_0$ of our initial probability space where

$$\max_{i \le N} \left| \frac{|X_i|^2}{n} - 1 \right| \le \theta/2.$$

Note that by (2) the probability of this event is at least $1 - P(\theta/2)$ and if
this event occurs then we also have

$$\max_{i \le N} |X_i| \le 3\sqrt{n}/2.$$

i<N 

We will apply Theorem 12.11 with k = m, t = 9y/n/{100C^), where is 
the constant from Theorem 12.11 Additionally we assume that /3 < 2~^9^ and 
Ml < t. Then with probability at least 1 — ~ P{9/2) we have 

Bm ^ (ISv^ + 9/A)n < 9n/2. 
Together with (10) this proves $\delta_m(A/\sqrt{n}) \le \theta$. Thus we only need to check
when the estimates for $\beta$ and $M_1$ are satisfied.

Case 1. $\phi(x) = x^p$. We start by proving the estimate for $M_1$. We let
$\sigma = 2 + \varepsilon$, $\varepsilon \le \min\{1, (p - 4)/4\}$ and $\lambda = 2$. Then by Theorem 2.1 (see also
the Remark following it), for some absolute constant $C$ we have


Mi<C 



l+(4+2e)/p 


\ 2(2+£)/p 


\ m J 


Therefore the estimate Mi < cO^/n with c = l/(100e^) is satished provided 


that 


m = 


/AT \-2(2+£)/(p-4-2£) 

C{9,e,p)n{^—j 


with C{9,e,p) dehned in (ITT|) and the absolute constants properly adjusted. 

Now we estimate the probability. From Theorem 12.11 (and the Remark 
following it), with our choice of t and A we have 


a _I_ .‘■y ' 9fl2 

^ - 3e2£2Ar2r2 ^ 4:{2ce 9)PnP/^ “ 

provided that 2®/(e:6*r) < N < 2“^6*yT (0.4ce 6*)^^^ This completes the 
proof of the hrst case. 


Case 2. (p{x) = (1/2) exp(x“). As in the hrst case we start with the 
condition Mi < t. We choose A = 4. Note that Nr/m > 2^/" as Nr > 2}l°'n. 
Therefore for some absolute constant C, 

Ml < Vm{C\n{2NT/ m))^^“. 


Therefore the condition Mi < t is satished provided that 
m < n ^ln(C^^“ Nr/ (0^ n)) j 


for an absolute positive constant $C_1$. This justifies the choice of $m$.

Now we estimate the probability. From Theorem 2.1 with our choice of $t$
and $\lambda$ we have


/9< 


(10A^r)2 


exp 


-2m“/2 


(3.51n(2m))2« 


N^t 

H-exp 


-c{9i/^y)<2-^9^, 


provided that 4/(6'r) < N < 2 ^9^/Texp [c{9^/n)‘^). This completes the 
proof. □ 
4 Approximating the covariance matrix

We start with the following $\varepsilon$-net argument for bilinear forms, which will be
used below.

Lemma 4.1 Let $m \ge 1$ be an integer and $T$ be an $m \times m$ matrix. Let
$\varepsilon \in (0, 1/2)$ and $\mathcal{N} \subset B_2^m$ be an $\varepsilon$-net of $B_2^m$ (in the Euclidean metric). Then

$$\sup_{x \in B_2^m} |\langle Tx, x\rangle| \le (1 - 2\varepsilon)^{-1} \sup_{y \in \mathcal{N}} |\langle Ty, y\rangle|.$$

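Proof. Let $S = T + T^*$. For any $x, y \in \mathbb{R}^m$,

$$\langle Sx, x\rangle = \langle Sy, y\rangle + \langle Sx, x - y\rangle + \langle S(x - y), y\rangle.$$

Therefore, for $x, y \in B_2^m$, $|\langle Sx, x\rangle| \le |\langle Sy, y\rangle| + 2|x - y|\,\|S\|$. Since $S$ is symmetric, we have

$$\|S\| = \sup_{x \in B_2^m} |\langle Sx, x\rangle|.$$

Thus, if $|x - y| \le \varepsilon$, then

$$\|S\| \le \sup_{y \in \mathcal{N}} |\langle Sy, y\rangle| + 2\varepsilon \|S\|$$

and

$$\sup_{x \in B_2^m} |\langle Sx, x\rangle| \le (1 - 2\varepsilon)^{-1} \sup_{y \in \mathcal{N}} |\langle Sy, y\rangle|.$$

Since $T$ is a real matrix, then for every $x \in \mathbb{R}^m$, $\langle Sx, x\rangle = 2\langle Tx, x\rangle$. This
concludes the proof. □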

Now we can prove the following technical lemma, which emphasizes the 
role of the parameter Ak in estimates of the distance between the covari¬ 
ance matrix and the empirical one. This role was first recognized in [8] 
and [2]. Other versions of the lemma appeared in [H |3]. Its proof uses the 
symmetrization method as in [25] . 

Lemma 4.2 Let T>l,l<k<N and Xi ,..., X^r be independent random 
vectors in Let p > 2, a G (0, 2]. Let cj) be either (f>{t) = t^ in which case 
we set assume 

Vl<i<X E\{Xi,a)\P <T, 




or (pit) = (l/2)exp(t") in which case we assume that Xi’s satisfy hypothesis 
with parameter r and set Cip = Sy/CfPWr, where Ca = (S/a) r(4/Q!), 
r(-) is the Gamma function. Then, for every A, Z > 0, 

sup 
aeS"-l 


N 

E 

2=1 


i{Xi,a)^ -E{Xi,a)^) 


< 2^^ + 6^/f^Z + Cs 


with probability larger than 


1 - 4exp(-n) - 4P(Afc > A) - 4 x Q” sup 



The term involving $Z$ in the upper bound will be bounded later using
general estimates in Lemma 4.4. Thus Lemma 4.2 clearly stresses the fact
that in order to estimate the distance between the covariance matrix and the
empirical one, it will remain to estimate $A_k$, to get $A$.

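Proof: Let $\Lambda \subset \mathbb{R}^n$ be a $(1/4)$-net of the unit Euclidean ball in the Euclidean
metric of cardinality not greater than $9^n$. Let $(\varepsilon_i)_{1 \le i \le N}$ be i.i.d. $\pm 1$ Bernoulli
random variables of parameter 1/2. By Hoeffding's inequality, for every $t > 0$
and every $(s_i)_{1 \le i \le N} \in \mathbb{R}^N$,

$$\mathbb{P}_{(\varepsilon_i)} \left( \left| \sum_{i=1}^{N} \varepsilon_i s_i \right| \ge t \left( \sum_{i=1}^{N} |s_i|^2 \right)^{1/2} \right) \le 2\exp(-t^2/2).$$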

Fix an arbitrary 1 < k < N. For every {si)i<i<N G there exists a 
permutation tt of {1,..., A^} such that 


N 

I 

i=l 


N 


< 




'7r(2)'^ 


2=1 


2=fc + l 


where $(s_i^*)_i$ denotes a non-increasing rearrangement of $(|s_i|)_i$.

Also, it is easy to check using (6) that for any $a \in S^{n-1}$ and any
$I \subset \{1, \dots, N\}$ with $|I| \le k$, $\sum_{i \in I} \langle X_i, a\rangle^2 \le A_k^2$.

Thus, for every a G 


N 


N 


P, 


Li) 


2 = 1 


Y,e^{X,,a)^ > 1 - 2exp(-tV2). 


2=fc + l 
Note that $\sum_{i=1}^{k} (\langle X_i, a\rangle^2)^* = \sum_{i \in E} \langle X_i, a\rangle^2$ for some set
$E \subset \{1, \dots, N\}$ and we can apply a union bound argument indexed by $\Lambda$ together
with Lemma 4.1. We get that


N 


P, 


hi) 


snp 
.aeS"- 


I ^ ^ (A"*, Cl) 


2 = 1 


< 2 


A\ + t snp 

asA 


N 

Y. ((-’f.. “>•)“) 


1 / 2 - 


i=k+l 


> 1 - 2 X 9’"exp(-tV2). 

Using again a union bound argument and the triangle inequality to estimate
the probability that the $(X_i)$ satisfy


1/2 

snp( {{X,,aYf) >Z, 


and choosing $t = 3\sqrt{n}$ (so that $2 \cdot 9^n \exp(-t^2/2) \le e^{-n}$) we get that


snp 

aeS"~ 


N 

j'^ei{Xi,aY 

2=1 


< 2A‘^ + 6y/nZ 


with probability larger than 


1 - e"^ 


P(Afc > A) - 9"' snp P 

ae5"-i 


N ^ ,2 

( Y «X. «)•)") 



Now we transfer the result from Bernoulli random variables to centered
random variables (see [21], Section 6.1). By the triangle inequality, for every
$s, t > 0$, one has


/ I ^ 

m(s)P snp I {{Xj, aY - E(X^, af) 


I> 5 “h t 


/ I 

< 2P snp I y^^ei{Xi,aY 
Vaes"-iI 



where Tn{s) = infag 5 n-i P ^ 


Eti((A.,a)2-E(W,a)2) <s 

To conclude the proof it is enough to find $s$ so that $m(s) \ge 1/2$. To
this end we will use a general Lemma 4.3 (below). First consider $\phi(t) = t^p$.
For $a \in S^{n-1}$ set $Z_i = |\langle X_i, a\rangle|^2/\tau^{2/p}$ and $q = p/2$. Then by Lemma 4.3
we have $m(s) \ge 1/2$ for $s = 4N^{1/r}\tau^{2/p}$ and $r = \min(p/2, 2)$. Now consider
$\phi(t) = (1/2)\exp(t^\alpha)$. Then for every $a \in S^{n-1}$ and every $i \le N$ using
hypothesis $H(\phi)$ we have

$$\mathbb{E}|\langle X_i, a\rangle|^4 \le 8\tau \int_0^\infty t^3 \exp(-t^\alpha)\, dt = \frac{8\tau}{\alpha}\, \Gamma\!\left( \frac{4}{\alpha} \right) =: \tau C_\alpha.$$

Given $a \in S^{n-1}$, set $Z_i = |\langle X_i, a\rangle|^2/\sqrt{\tau C_\alpha}$. Then $\mathbb{E}Z_i^2 \le 1$. Applying again
Lemma 4.3 (with $q = 2$), we observe that $m(s) \ge 1/2$ for $s = 4\sqrt{C_\alpha \tau N}$. This
completes the proof. □


It remains to prove the following general lemma. For convenience of the 
argument above, we formulate this lemma using two powers q and r rather 
than just one. 


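Lemma 4.3 Let $q \ge 1$ and $Z_1, \dots, Z_N$ be independent non-negative random
variables satisfying

$$\forall\, 1 \le i \le N \qquad \mathbb{E}Z_i^q \le 1.$$

Let $r = \min(q, 2)$, then

$$\forall z \ge 4N^{1/r} \qquad \mathbb{P}\left( \left| \sum_{i=1}^{N} (Z_i - \mathbb{E}Z_i) \right| \ge z \right) \le \frac{1}{2}.$$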

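Proof: By definition of $r$, we have for all $i = 1, \dots, N$, $\mathbb{E}Z_i^r \le 1$. Since the
$Z_i$'s are independent, we deduce by a classical symmetrization argument that

$$\mathbb{E}\left| \sum_{i=1}^{N} (Z_i - \mathbb{E}Z_i) \right| \le 2\,\mathbb{E}\,\mathbb{E}_{(\varepsilon_i)} \left| \sum_{i=1}^{N} \varepsilon_i Z_i \right| \le 2\,\mathbb{E}\left( \sum_{i=1}^{N} Z_i^2 \right)^{1/2} \le 2\,\mathbb{E}\left( \sum_{i=1}^{N} Z_i^r \right)^{1/r}$$

since $r \in [1,2]$. From $\mathbb{E}Z_i^r \le 1$, we get that

$$\mathbb{E}\left| \sum_{i=1}^{N} (Z_i - \mathbb{E}Z_i) \right| \le 2\,\mathbb{E}\left( \sum_{i=1}^{N} Z_i^r \right)^{1/r} \le 2 \left( \sum_{i=1}^{N} \mathbb{E}Z_i^r \right)^{1/r} \le 2N^{1/r}.$$

By Markov's inequality we get

$$\mathbb{P}\left( \left| \sum_{i=1}^{N} (Z_i - \mathbb{E}Z_i) \right| \ge 4N^{1/r} \right) \le \frac{1}{2}$$

and since $z \ge 4N^{1/r}$ this implies the required estimate. □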

The following lemma is standard (cf. Lemma 5.8 in [21], which however
contains a misprint).


Lemma 4.4 Let q > 0 and let Zi ,..., be independent non-negative ran¬ 
dom variables satisfying 


VI < i Vt > 1 F{Zi >t)< 1/tL 
Then, for every s > 1, with probability larger than 1 — s~^, one has 

if 0 < q < 1 

2esiVln(f) zf q = 1 

12g(esp/-> N lfq>l. 

Proof: Assnme hrst that 0 < g < 1. It is clear that 



\/l<i<N 


p(z; >t)< 



< {Ne/it^iy, 


where we nsed the ineqnality (^) < {Ne/if. Thus if eNt < 1, then 


>t)< Y,(Nel0 = [fffv - 


i>k 


i>k 


Therefore if eNt~‘^ < 1/2, then P(supj>^ > t) < {2eNt~^Y. Since 

the inequality is trivially true if eNt~'^ > 1/2, it is proved for every f > 0. 
Therefore for g < 1 we have 


N OO 

i=k i=k 



ki-i/1 \ 

i-i/J 


1 - g 


with probability larger than 1 — {fleN/t^^Y. Choosing t = (2esiV)^/'?, we 
obtain the estimate in the case 0 < g < 1. 

For g = 1 we have 


N 


N 


i=k 


i=k 


< t ( — + ln(iV/fc) j < t ln{eN/k) 
with probability larger than $1 - (2eN/t)^k$. To obtain the desired estimate
choose $t = 2esN$.

Now assnme that g > 1. Set i = |'log 2 fc]. The same compntation as 
before for the scale (2*/'^) instead of gives that 


P 


. i>i 



< ^ {Net-^Y < {2eNt-^Y'. 

i>i 


Note also that 


Thns 


>t)< (iVer^)^ 


N 


riog2 Afi 


^ z; <kZl^ ^ < t + ( 4 Ar)^-1/7(2^-^/^ - 1)) 


i=k 


i=l 


< t f < t 


1 — q J 1 — g 

with probability larger than {Net~‘^Y+{2Net~‘^Y. Thns, taking f = (4esiV)^/'^, 


we obtain 


N 


P Ez. 


. < 


, i=k 


12g(es)^/'' 
g - 1 


N\>l-s 


-k 


□ 


We are now ready to tackle the problem of approximating the covariance
matrix by the empirical covariance matrices, under hypothesis $H(\phi)$ with
$\phi(t) = t^p$. As our proof works for all $p > 4$, we also include the case $p > 8$
originally solved in [22] (under additional assumption on $\max_i |X_i|$). For
clarity, we split the result into two theorems. The case $4 < p \le 8$ has been
stated as Theorem 1.2 in the Introduction.

Before we state our result, let us remark that $p > 2$ is a necessary
condition. Indeed, let $(e_i)_{1 \le i \le n}$ be an orthonormal basis of $\mathbb{R}^n$ and let $Z$ be
a random vector such that $Z = \sqrt{n}\, e_i$ with probability $1/n$. The
covariance matrix of $Z$ is the identity $I$. Let $A$ be an $n \times N$ random matrix
with independent columns distributed as $Z$. Note that if $\|\frac{1}{N} AA^\top - I\| < 1$
with some probability, then $AA^\top$ is invertible with the same probability. It
is known (coupon collector's problem) that $N \sim n \log n$ is needed to have
$\{Z_i : i \le N\} = \{\sqrt{n}\, e_i : i \le n\}$ with probability, say, 1/2. Thus for the vector
$Z$, the hypothesis $H(\phi)$ with $\phi(t) = t^2$ is satisfied but $N \sim n \log n$ is needed for
the covariance matrix to be well approximated by the empirical covariance
matrices with probability 1/2.
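
The coupon collector threshold in this example can be computed exactly for moderate $n$. The sketch below uses inclusion-exclusion for the probability that $N$ independent uniform draws from $n$ directions hit every direction, and then searches for the smallest $N$ at which this probability reaches 1/2; the function names are ours, and the comparison with $n \ln n$ is only meant as an order-of-magnitude illustration.

```python
import math
from math import comb


def prob_all_coupons(n: int, N: int) -> float:
    """P(N uniform draws from n coupons hit every coupon), by inclusion-exclusion."""
    return sum((-1) ** j * comb(n, j) * (1 - j / n) ** N for j in range(n + 1))


def draws_needed(n: int, target: float = 0.5) -> int:
    """Smallest N >= n with P(all n coupons seen in N draws) >= target."""
    N = n
    while prob_all_coupons(n, N) < target:
        N += 1
    return N


# Compare the exact threshold with the n*ln(n) scale for a few sizes.
for n in (10, 20, 50):
    print(n, draws_needed(n), round(n * math.log(n), 1))
```

In the setting of the remark, fewer draws than this threshold leave some direction $e_i$ unobserved with probability larger than 1/2, in which case $AA^\top$ is singular.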

We also would like to mention that we don't know how sharp the power
$\gamma/p$ appearing in the bound below is. In particular, it is not clear if it can
be improved to 1/2.
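
To see how far the exponent $\gamma/p$, with $\gamma = p - 4 - 2\varepsilon$, is from 1/2, one can simply tabulate it. The sketch below assumes the constraint $0 < \varepsilon \le \min\{1, (p-4)/4\}$ on the range $4 < p \le 8$; the function name is ours.

```python
def rate_exponent(p: float, eps: float) -> float:
    """gamma/p with gamma = p - 4 - 2*eps, the power of n/N for 4 < p <= 8."""
    if not 4 < p <= 8:
        raise ValueError("p must lie in (4, 8]")
    if not 0 < eps <= min(1.0, (p - 4) / 4):
        raise ValueError("eps must lie in (0, min(1, (p - 4)/4)]")
    return (p - 4 - 2 * eps) / p


# The exponent stays below 1/2 and approaches it only as p -> 8 and eps -> 0.
for p in (5.0, 6.0, 7.0, 8.0):
    eps = (p - 4) / 8  # an admissible illustrative choice
    assert rate_exponent(p, eps) < 0.5
```

Since $(p-4)/p \le 1/2$ for $p \le 8$, the exponent is always below 1/2 on this range, which is the gap the remark refers to.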


Theorem 4.5 Let $4 < p \le 8$ and $\phi(t) = t^p$. Let $X_1, \dots, X_N$ be independent
random vectors in $\mathbb{R}^n$ satisfying hypothesis $H(\phi)$. Let $\varepsilon \le \min\{1, (p - 4)/4\}$
and $\gamma = p - 4 - 2\varepsilon$. Then with probability larger than

1 - Se-'^ - 2 £-p/ 2 max 

one has 

1 ^ 

Qggn-l iV 

where 

C{p,e) = (p- 4 )-i/ 2 ^-^( 2 +e)/p_ 
and C is an absolute constant. 



An immediate consequence of this theorem is the following corollary. 


Corollary 4.6 Under assumptions of Theorem m assuming additionally 
that maxj |Xjp < with high probability, we have with high prob¬ 

ability 


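$$\sup_{a \in S^{n-1}} \Big| \frac{1}{N} \sum_{i=1}^{N} \big( \langle X_i, a\rangle^2 - E\langle X_i, a\rangle^2 \big) \Big| \le C_1\, C(p,\varepsilon) \Big(\frac{n}{N}\Big)^{\gamma/p},$$

where $C$ and $C_1$ are absolute positive constants.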


Theorem 4.7 There exists a universal positive constant C such that the 
following holds. Let p > 8 , a G (0, 2]. Let 0 and be either (fit) = t^ and 
C(j) = C or (fit) = (l/2)exp(f") and C^f, = LetX\,...,X^ be 

independent random vectors in M” satisfying hypothesis H(0). In the case 
(fit) = t^ we also define 

$$p_0 = 8e^{-n} + 2 \Big( \frac{3p-8}{6(p-8)} \Big)^{p/2} N^{-(p-8)/8}$$

and in the case $\phi(t) = (1/2)\exp(t^{\alpha})$, we assume $N \ge (4/\alpha)^{8/\alpha}$ and define


_ Q -n 1 f \ N'^ 

^ (10iV)4 V(3-51n(2n))2“ J ^ 2exp((2niV)«/4)' 

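Then in both cases with probability larger than $1 - p_0$ one has

$$\sup_{a \in S^{n-1}} \Big| \frac{1}{N} \sum_{i=1}^{N} \big( \langle X_i, a\rangle^2 - E\langle X_i, a\rangle^2 \big) \Big| \le \frac{C}{N} \max_{i \le N} |X_i|^2 + C\, C_{\phi} \sqrt{\frac{n}{N}}.$$
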

As our argument works in all cases we prove both theorems together.
Proof of Theorems 4.5 and 4.7. We first consider the case $\phi(t) = t^p$.
Note that in this case

$$E|\langle X_i, a\rangle|^4 \le 1 + \int_1^{\infty} P\big(|\langle X_i, a\rangle|^4 > t\big)\, dt \le 1 + \int_1^{\infty} 4 s^{3-p}\, ds = \frac{p}{p-4}.$$


Thus, by Lemma 14.21 it is enough to estimate + \/n Z + \/p/{p — 4) y/N 
and the corresponding probabilities. We choose k = n. 

In the case $\phi(t) = t^p$ we apply Lemma 4.4 with $Z_i = |\langle X_i, a\rangle|^4$, $i \le N$,
$q = p/4 > 1$ and $s = 9e$. It gives


P 



< (9e)-", 


for 

2 = ^ ^ /ilL(3e)4/P Wn, 

Vg — 1 y p — 4 

Now we estimate $A_n$, using Theorem 2.1.


Case 1: $4 < p \le 8$ (Theorem 4.5). We apply Theorem 2.1 (and the
Remark following it), with $\sigma = 2 + \varepsilon$, where $\varepsilon \le (p-4)/4$, $\lambda = 3$ and
$t = 3 N^{2/p} n^{\delta}$ for $\delta = 1/2 - 2/p$. Then

$$M_1 \le C_0(p,\varepsilon) \sqrt{n}\, (N/n)^{(2+\varepsilon)/p},$$


where 


Co{p,e) = C 


p 


(p-4-2e)/p 


^ \ 2{2+e)/p 
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and 


1 / 12 V 


< £-Pmax{iV-^n"'^P} < 1/64 


^ 5 \5e£iV/ 4(p — 4)Pn'^P 

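provided that $n$ is large enough. Then, using $\delta = 1/2 - 2/p$, we obtain

$$A_n^2 \le C \Big( \max_{i \le N} |X_i|^2 + N^{2/p} n^{\delta} \max_{i \le N} |X_i| + C_0^2(p,\varepsilon)\, n\, (N/n)^{2(2+\varepsilon)/p} \Big) \le 2C \Big( \max_{i \le N} |X_i|^2 + C_0^2(p,\varepsilon)\, n\, (N/n)^{2(2+\varepsilon)/p} \Big).$$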

Combining all estimates and noticing that (p — 4)“'>' < 2, we obtain that the 
desired estimate holds with probability 

$$1 - 8e^{-n} - 2\varepsilon^{-p/2} \max\{N^{-3/2},\, n^{-(p/4-1)}\}.$$


< 2C (max 
\ i<N 


Case 2: $p > 8$ (Theorem 4.7). In this case we apply Theorem 2.1 (see
also the Remark following it), with $\sigma = p/4$, $\lambda = (p-4)/2$, $t = 3(nN)^{1/4}$.
Then $M_1 \le C \sqrt{n}\, (N/n)^{1/4}$ and


Y< 


2(3p-8) ^ (3p-8)P 

5e(p — 8)iV/ p — 5 4(6(p — 


< NAp-^)An-PA < 1/64, 

\6{p-8)J 

provided that $N$ is large enough. Thus with probability at least $1 - \sqrt{\beta}$ we
have

$$A_n^2 \le C \Big( \max_{i \le N} |X_i|^2 + (nN)^{1/4} \max_{i \le N} |X_i| + \sqrt{nN} \Big) \le 2C \Big( \max_{i \le N} |X_i|^2 + \sqrt{nN} \Big).$$

Combining all estimates we obtain that the desired estimate holds with
probability

$$1 - 8e^{-n} - 2 \Big( \frac{3p-8}{6(p-8)} \Big)^{p/2} N^{-(p-8)/8}.$$


Case 3: = (l/2)exp(t") fTheorem 14.7p . As in Case 2 we apply 

Lemma 14.21 It implies that it is enough to estimate + ^/n Z + ^JC{a)N, 
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with C{a) from Lemma [4.21 and the corresponding probabilities. A direct 
calculations show that in this case we have for C'^ = (4/a)^/" and t > 1, 


r{{\x\ic'J > t] <2eMCL)ei' < 


tx 

We apply Lemma 03] with Zi = \ (Xj,a) ji < N^ q = 2 and s = 9e. 
It gives 


P 




for 


Z = (C';)^/^6\/6e\/]V. 

To estimate An we use Theorem 12.11 with t = and 

A = 10 (A^/n)“^‘^ min |l, {a ln(2A^/n))~^} . 


Note that 


max |4,10 (ln(2Ar/n))"^| < A < 10 . 

Then for absolute positive constants (P, C", 

Ml < {C\f" + <(—Y 


and 


/5< 


■ exp 


4^a/2 


+ 


a 




< 1/64, 


(10A^)4 V(3-51n(2n))2“y 2 exp((2nA^)“A) 

provided that N > (4/a)®/“. Thus with probability at least 1 — we have 

( run \ 2/a 

-) \/nN, 

a J 


where C" and C" are absolute positive constants. This together with the 
estimate for Z completes the proof (note that C{a) < C{2/aY^°‘). □ 
5 The proof of Theorem 2.1


In this section we prove the main technical result of this paper, Theorem 2.1,
which establishes upper bounds for norms of submatrices of random matrices
with independent columns. Recall that for $1 \le k \le N$ the parameters $A_k$
and $B_k$ are defined by (6).


5.1 Bilinear forms of independent vectors 

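Let $X_1, \dots, X_N$ be independent random vectors and $a \in \mathbb{R}^N$. Given disjoint
sets $T, S \subset \{1, \dots, N\}$ we let

$$Q(a, T, S) = \Big| \Big\langle \sum_{i \in T} a_i X_i,\ \sum_{j \in S} a_j X_j \Big\rangle \Big|, \tag{12}$$

with the convention that $\sum_{i \in \emptyset} a_i X_i = 0$.

The following two lemmas are in the spirit of Lemma 2.3 in [25]. Recall
that $(s_i^*)_i$ denotes a non-increasing rearrangement of $(|s_i|)_i$.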


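Lemma 5.1 Let $X_1, \dots, X_N$ be independent random vectors in $\mathbb{R}^n$. Let
$\gamma \in (1/2, 1)$, $I \subset \{1, \dots, N\}$, and $a \in \mathbb{R}^N$. Let $k \ge |\mathrm{supp}(a)|$. Then there exists
$\bar{a} \in \mathbb{R}^N$ such that $\mathrm{supp}(\bar{a}) \subset \mathrm{supp}(a)$, $|\mathrm{supp}(\bar{a})| \le \gamma k$, $|\bar{a}| \le |a|$, and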


{ m+l—l m+l.—l 

Y. E ^ f ’ 

i=m i=m ) 


where $\ell = \lceil (1-\gamma) k \rceil$, $m = \lceil (\gamma - 1/2) k \rceil$, and


IT, 



/or i G /, 


/or j G R. 


Proof. Let $E \subset \{1, \dots, N\}$ be such that $\mathrm{supp}(a) \subset E$ and $|E| = k$.
Everything is clear when $k = 0$ or $1$, because then $Q(a, I, I^c) = 0$. Thus we may
assume that $k \ge 2$. Let $F_1 = E \cap I$ and $F_2 = E \cap I^c$. First assume that
$s := |F_1| \ge k/2$. Note that $(1-\gamma)k < k/2 \le s$, so that $\ell \le s$. Let $J \subset F_1$
be a set with $|J| = \ell$ such that the set $\{|V_j| : j \in J\}$ consists of the $\ell$ smallest
values among the values $\{|V_i| : i \in F_1\}$. (That is, $J \subset F_1$ is such that $|J| = \ell$
and for all $j \in J$ and $i \in F_1 \setminus J$ we have $|V_i| \ge |V_j|$.) Now we let

$$\bar{F}_1 = F_1 \setminus J \quad \text{and} \quad \bar{F}_2 = F_2.$$

Define the vector $\bar{a} \in \mathbb{R}^N$ by the conditions

$$\bar{a}|_{\bar{F}_1} = a|_{\bar{F}_1}, \qquad \bar{a}|_{J} = 0, \qquad \bar{a}|_{\bar{F}_2} = a|_{\bar{F}_2}.$$

Thus $\bar{a}$ differs from $a$ only on coordinates from $J$; in particular its support
has cardinality less than or equal to $|E| - |J| = k - \ell \le \gamma k$.
Moreover,


Q{a,I,F) = 


< 


E ttjXj 

OeFi jeF2 

I 

jeF2 

ieJ \ jeF2 


\i&Fi\J j&F2 

Q{a,I,n. 


Then we have 

Q(a,/,F)<g(a,/,r) + 5 ^ 


ieJ 


E ttjXj 


3 &F 2 


m+l—l 


<Q(a,I,n+ Y, V:<Q(a,I,P)+ Y 


i=s—l+l 


where $m = \lceil (\gamma - 1/2) k \rceil$ and using that $s - \ell + 1 \ge k/2 - \lceil (1-\gamma) k \rceil + 1 > (\gamma - 1/2) k$.

If $|F_1| < k/2$ then $|F_2| \ge k/2$ and we proceed similarly, interchanging the
roles of $F_1$ and $F_2$ and obtaining

m+t—l 

Q{a,I,F)<Q{a,I,F)+ Y, 

i=m 

□ 
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Lemma 5.2 Let r > 1 and Xi, • • • ,X]\f be independent random vectors in 
M"" satisfying hypothesis H(0) for some function (p ^ Ai with parameter r. 
Let a G with |a| = 1. In the notation of Lemma \5.1[ for every t > 0 one 
has 



>tAk] < {2 tY 



( ty/YYk \\ 

V(1 -7)^ + 1// 


where {Lfi\i denotes either {Vi}i or and 7o = 7 — 1/2. 

Remarks. 1. Taking 0(t) = t^ for some p > 0, we obtain that if 

¥{\{Xi,a)\>t)<t-P 


then 

r < (2rr (Xp) 

Note that the condition ffT^ is satished if 

sup sup E| {Xi, a) 1^ < r. 

i<N aeS’^-l 


—mp 


2. Taking cp = (1/2) exp(a;“) for some a > 0, we obtain that if 

P (I {Xi, a) I > f) < 2 exp(— 


then 




\k-\-m 


P U* > tAk I < (2r)'—exp ( -m 

\ i=m / 

Note that the condition ffTSl) is satished if 


sup sup Eexp (I {Xi, a) |") < 2r. 

i<N ae5"-i 


(13) 

(14) 


(15) 


(16) 


Proof. Without loss of generality assume that Ui = Vi for every i. Then 


m+£—l 


i=m 
Let $F_1 = \mathrm{supp}(a) \cap I$ and $F_2 = \mathrm{supp}(a) \cap I^c$. Note that $V_m^* > s$ means that
there exists a set $F \subset F_1$ of cardinality $m$ such that $|V_i| > s$ for every $i \in F$
(if the cardinality of $F_1$ is smaller than $m$, the estimate for probability is trivial).
Since $|F_1| \le k$, we obtain

/m+e-i \ / h\ 

P 5^ K; > M, < P {IV,I > tA,) < (^ j max P 


(vieF: 


Denote Z := Ylj£F 2 ^j^r Since |a| < 1 then jZj < Ak, and note that the 
Xj’s, i E Fl are independent of Z. Thus, conditioning on Z we obtain 


^ m+£—l 


P 


< 2 ^ max TT P 

FCFi XX 
|F|=m ieF 



{X^,Z)\> 



< ( 2 r)^ 


max 

FCFi 
|F|=m i£F 


n 



-1 


Now we show that for every s > 0, 

Indeed, this estimate is equivalent to 

— R In 0 ( ) > In 0 (s^/m) , 

^ R vl«*l/ 

which holds by convexity of hi (^{l/s/x), the facts that |a| < 1 and |F| = m, 
and since 0 is increasing. Taking s = t/i, we obtain 

( m+£—l \ 

R >tAA< (2r)^ (0 {t . 

i=m / 

Finally note that $m = \lceil (\gamma - 1/2) k \rceil \ge \gamma_0 k$ and $\ell = \lceil (1-\gamma) k \rceil \le (1-\gamma) k + 1$.
Since $\phi$ is increasing, we obtain the last inequality, completing the proof. □
5.2 Estimates for off-diagonal of bilinear forms

For $1 \le k \le N$ and $I \subset \{1, \dots, N\}$ we define $Q_k(I)$ by

$$Q_k(I) = \sup_{E \subset \{1,\dots,N\},\ |E| \le k}\ \sup_{a \in B_2^E} Q(a, E \cap I, E \cap I^c). \tag{17}$$

Lemmas 5.1, 5.2 and 4.1 imply the following proposition.


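Proposition 5.3 Let $r \ge 1$ and $X_1, \dots, X_N$ be independent random vectors
in $\mathbb{R}^n$ satisfying hypothesis $H(\phi)$ with parameter $r$ for some function $\phi \in \mathcal{M}$.
Let $\varepsilon \in (0, 1/2)$, $2 \le k \le N$, $I \subset \{1, \dots, N\}$, $\gamma \in (1/2, 1)$, and $\gamma_0 = \gamma - 1/2$.
Then for every $t > 0$ one has

$$P\Big( Q_k(I) > \frac{Q_{\lfloor \gamma k \rfloor}(I) + t A_k}{1 - 2\varepsilon} \Big) \le \exp\Big( k \Big( \ln \frac{5 r e N}{k \varepsilon} - \gamma_0 \ln \phi\Big( \frac{t \sqrt{\gamma_0 k}}{(1-\gamma) k + 1} \Big) \Big) \Big).$$

Moreover, letting $M = \max_i |X_i|$ one has, for all $\ell \ge 1$ and $t > 0$,

$$P\big( Q_{\ell}(I) > t M \big) \le \frac{N^2 r}{4 \phi(4t/\ell)}.$$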


Proof. For every E (Z 1, ...,N with \E\ = k let A//; be an £-net in S® of 
cardinality at most (2.5/e)^. Let Af denote the union of A/e’s. Lemma IFTI 
yields 

Qfc(7) < (1 — 25 )“^ sup sup Q{a, E n I, E D E). 

BC{1,...,JV} aeJ^E 
\E\<k 

Therefore, applying Lemmas 15.11 and 15.21 we observe that the event 


Qk{I)<{l-2e)-^ 


sup sup Q{a, E r\ I, E r\ E) + 

Ec{i,...,N} aeAf 
|£ I <'yk 



occurs with probability at least 



/ ty/^ \\ 

\{l--f)k + l) J 


This implies the hrst estimate. 
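Now we prove the "moreover" part. For every $E \subset \{1, \dots, N\}$ of
cardinality $\ell$ denote $F_1 = E \cap I$, $F_2 = E \cap I^c$, $m = |F_1|$ (so $|F_2| = \ell - m$). We
also denote

$$M_0 := \max_{i \in I} \max_{j \in I^c} |\langle X_i, X_j \rangle| \quad \text{and} \quad M_1 := \max_{j \in I^c} |X_j|.$$

Then for any $a \in B_2^E$ we have

$$Q(a, F_1, F_2) = \Big| \sum_{i \in F_1} \sum_{j \in F_2} a_i a_j \langle X_i, X_j \rangle \Big| \le M_0 \sum_{i \in F_1} |a_i| \sum_{j \in F_2} |a_j| \le \sqrt{m(\ell - m)} \Big( \sum_{i \in F_1} a_i^2 \Big)^{1/2} \Big( \sum_{j \in F_2} a_j^2 \Big)^{1/2} M_0 \le \frac{\ell}{4} M_0.$$

Therefore, by the union bound,

$$P\big( Q_{\ell}(I) > t M_1 \big) \le P\big( M_0 > 4 t M_1 / \ell \big) \le \sum_{i \in I} \sum_{j \in I^c} P\big( |\langle X_i, X_j \rangle| > 4 t M_1 / \ell \big).$$

Finally, using the fact that $X_i$ is independent of $X_j$ for $i \ne j$, $|X_j| \le M_1$ for
every $j \in I^c$, and using the tail behavior of variables $\langle X_i, z \rangle$, we obtain

$$P\big( Q_{\ell}(I) > t M \big) \le P\big( Q_{\ell}(I) > t M_1 \big) \le \frac{N^2 r}{4 \phi(4t/\ell)}.$$

□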


Proposition 5.4 Let 1 < A; < X. Letr > 1 and Xi^ ■ ■ ■ ,X]\f be independent 
random vectors in M” satisfying hypothesis H(0) with parameter r for some 
function 0 G XI. Let t > 0, A > 1. 

Case 1. Let p> A and (f){x) = . Let a G (2,p/2). Then 


Qk{I) < 
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occurs with probability at least 


/ 2(a + A) y 1 N‘^T{a + X)P 
\5TeN{a-2)) 2A^ ~ 


and 


(18) 


C2{(P,X,p) 


a + X / 2p Y+'"/Y2(a + A)y"/^ 
l + A/2 Vp-2cTy V a-2 ) 


Case 2. Assume that (j){x) = (1/2) exp(a;“) for some a > 0. Then for 
every t > 0 , 

Qk{I) < j 

with probability at least 

1 / Afc“/2 \ ivV 

“ (lOriV)^ V (3.5 ln(2A:))2« J “ 2exp((2t)“)' 

Proof. Let $\gamma \in (1/2, 1)$ to be chosen later. For integers $s \ge 0$ denote $k_0 = k$,
$k_{s+1} = \lfloor \gamma k_s \rfloor$. Clearly, the sequence is strictly decreasing whenever $k_s \ge 1$
and $k_s \le \gamma^s k$. Assume that $k > 1/(1-\gamma)$. Define $m$ to be the largest integer
$m \ge 1$ such that $k_{m-1} > 1/(1-\gamma)$. Note that $\gamma k_{m-1} \ge 1$. Therefore

$$1 \le k_m \le \frac{1}{1-\gamma} < k_{m-1}. \tag{19}$$
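The recursion for the support sizes can be made concrete with a short computation. This is only an illustrative sketch: it assumes the bracket in the definition of $k_{s+1}$ is a floor (which is what the bound $k_s \le \gamma^s k$ requires), and the function name `shrinking_supports` is hypothetical. Exact rational arithmetic is used so that the floor is not affected by rounding.

```python
from fractions import Fraction
from math import floor


def shrinking_supports(k, gamma):
    """Return (ks, m) with ks = [k_0, ..., k_m], k_0 = k, k_{s+1} = floor(gamma * k_s).

    m is the largest index such that k_{m-1} > 1/(1 - gamma), so the last
    entry k_m is the first term that is at most 1/(1 - gamma).
    Assumes 1/2 < gamma < 1 and k > 1/(1 - gamma).
    """
    gamma = Fraction(gamma)
    threshold = 1 / (1 - gamma)
    if not (Fraction(1, 2) < gamma < 1) or k <= threshold:
        raise ValueError("need 1/2 < gamma < 1 and k > 1/(1 - gamma)")
    ks = [k]
    # Keep shrinking while the current size is still above the threshold.
    while ks[-1] > threshold:
        ks.append(floor(gamma * ks[-1]))
    return ks, len(ks) - 1
```

For instance, with `k = 100` and `gamma = Fraction(2, 3)` the sizes decrease until they reach the threshold $1/(1-\gamma) = 3$, and the final pair of terms satisfies the two-sided bound (19).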


By Proposition 15.31 we observe that for every positive tg and Eg G (0,1/2), 
0 < s < m, the event 


Qk{i) < 


Qkmi^) + 


m—1 

E 

s=0 


tsA 


m—1 




occurs with probability at least 


m—1 


E / , /, dreN 

exp I /Cs I In — - 7 o In < 


s=0 


kgS g 


( tgy/^ y 

V(1 -7)fc + 1// 


( 20 ) 
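Let $\varepsilon > 0$ and a positive decreasing sequence $(\varepsilon_s)_s$ be chosen later and set

$$t_s = \frac{(1-\gamma) k_s + 1}{\sqrt{\gamma_0 k_s}}\ \phi^{-1}\Big( \Big( \frac{5 r e N}{k_s \varepsilon_s} \Big)^{(1+\varepsilon)/\gamma_0} \Big),$$

where $\phi^{-1}(s) = \min\{t \ge 0 : \phi(t) \ge s\}$.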

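We start estimating $Q_k(I)$. Since $\ln(1-x) \ge -2x$ on $(0, 3/4]$, we observe
that for $\varepsilon_s \le 3/8$,

$$\sum_{s=0}^{m-1} \ln(1 - 2\varepsilon_s) \ge \sum_{s=0}^{m-1} -4\varepsilon_s,$$

so that

$$\prod_{s=0}^{m-1} (1 - 2\varepsilon_s)^{-1} \le \exp\Big( 4 \sum_{s=0}^{m-1} \varepsilon_s \Big).$$

Note that

$$\sum_{s=0}^{m-1} t_s A_{k_s} \le A_k \sum_{s=0}^{m-1} t_s.$$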

Thus by (20) and by our choice of $t_s$,

$$Q_k(I) \le \exp\Big( 4 \sum_{s=0}^{m-1} \varepsilon_s \Big) \Big( Q_{k_m}(I) + A_k \sum_{s=0}^{m-1} t_s \Big) \tag{21}$$

with probability at least

m—1 


f 5TeN\ f 5TeN\ 

1 — 2 2_^ exp I —kg e In —-j > 1 — 2 exp I —k^-i £ In —-j 2_^ 

s=0 s£s / V rn-l J 

Since km-i > 1/(1 — 7 ), this probability is larger than 


-kse 


1 — 2 exp 


1-7 


In (5re(l - 7 )iV) I ^ 


m—1 


^ks€ 


( 22 ) 


s=0 


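Thus it is enough to choose appropriately $\varepsilon_s$ and to estimate $Q_{k_m}(I)$
and $\sum_{s=0}^{m-1} t_s$. We distinguish two cases for $\phi$.

Case 1: $\phi(x) = x^p$. In this case we choose $\varepsilon_s = (s+2)^{-2}$ so that

$$\sum_{s=0}^{m-1} \varepsilon_s^{k_s \varepsilon} = \sum_{s=0}^{m-1} (s+2)^{-2 k_s \varepsilon} \le \sum_{s=0}^{m-1} (s+2)^{-2 k_{m-1} \varepsilon} \le \frac{1}{2 k_{m-1} \varepsilon - 1}.$$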
Choose $\varepsilon = \lambda (1-\gamma)$. Since $\lambda \ge 1$ and $k_{m-1} > 1/(1-\gamma)$ we have
$2 k_{m-1} \varepsilon > 2\varepsilon/(1-\gamma) = 2\lambda$ and

$$\sum_{s=0}^{m-1} \varepsilon_s^{k_s \varepsilon} \le \frac{1}{2\lambda - 1}.$$


Using again fcm-i 
larger than 


> (1 — 7 ) we conclude that the probability in fl 2 ^ is 
1 - (5Te/V(l - 7))-" 5 ^. (23) 


Now we estimate 4- We have 


to = 


(1 - ^)ks + 1 _i / f5TeN\ (1 - ^)k, + 1 f5TeN\ 




k.e. 


■) 


Vloks 


\ kgEg J 


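Recall that $\gamma > 1/2$, $k_{m-1} > 1/(1-\gamma)$, so that $(1-\gamma) k_s + 1 \le 2(1-\gamma) k_s$
for $s \le m-1$. Thus, with $b = (1+\varepsilon)/(\gamma_0 p)$,

$$t_s \le \frac{2(1-\gamma) \sqrt{k_s}}{\sqrt{\gamma_0}} \Big( \frac{5 r e N}{k_s \varepsilon_s} \Big)^{b}.$$

Assume that $b < 1/2$. Since $k_s \le \gamma^s k$, we have

$$\sum_{s=0}^{m-1} t_s \le \frac{2(1-\gamma)\, k^{1/2-b} (5 r e N)^{b}}{\sqrt{\gamma_0}} \sum_{s=0}^{m-1} (s+2)^{2b} \gamma^{s(1/2-b)}. \tag{24}$$

Since the function $h(z) = z^{2b} \gamma^{z(1/2-b)}$ on $\mathbb{R}_+$ is first increasing and then
decreasing, we get

$$\sum_{s=0}^{m-1} (s+2)^{2b} \gamma^{s(1/2-b)} \le \gamma^{-1} \sum_{s=2}^{m+1} h(s) \le \gamma^{-1} \Big( \sup_{z>0} h(z) + \int_0^{\infty} h(z)\, dz \Big) \le 2 \Big( \Big( \frac{2b}{(1/2-b)\, e \ln(1/\gamma)} \Big)^{2b} + \frac{\Gamma(1+2b)}{\big( (1/2-b) \ln(1/\gamma) \big)^{1+2b}} \Big).$$

As $2b < 1$, $\Gamma(1+2b) \le 1$. Using also that $\ln(1/\gamma) \ge 1-\gamma$, we observe that
the previous quantity does not exceed

$$\frac{4}{\big( (1/2-b)(1-\gamma) \big)^{1+2b}}. \tag{25}$$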
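As a sanity check of the bound (25), one can compare a long partial sum of $(s+2)^{2b} \gamma^{s(1/2-b)}$ with the closed-form majorant $4/((1/2-b)(1-\gamma))^{1+2b}$. The sketch below is illustrative only; the function names and the number of terms are assumptions made here, not part of the proof.

```python
def weighted_geometric_sum(b, gamma, terms=5000):
    """Partial sum of (s + 2)^(2b) * gamma^(s * (1/2 - b)) over s = 0, ..., terms - 1."""
    if not (0.0 < b < 0.5 and 0.5 < gamma < 1.0):
        raise ValueError("need 0 < b < 1/2 and 1/2 < gamma < 1")
    return sum((s + 2) ** (2 * b) * gamma ** (s * (0.5 - b)) for s in range(terms))


def closed_form_majorant(b, gamma):
    """The quantity 4 / ((1/2 - b)(1 - gamma))^(1 + 2b) from (25)."""
    return 4.0 / ((0.5 - b) * (1.0 - gamma)) ** (1 + 2 * b)
```

On a small grid such as `b` in `(0.1, 0.25, 0.4)` and `gamma` in `(0.6, 0.75, 0.9)`, the partial sums stay below the majorant, typically by a wide margin, which reflects that (25) is a convenient rather than tight bound.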


Coming back to (24), we get

$$\sum_{s=0}^{m-1} t_s \le \frac{8\, k^{1/2-b} (5 r e N)^{b}}{(1/2-b)^{1+2b} (1-\gamma)^{2b} \sqrt{\gamma - 1/2}}.$$


To conclude this computation, we choose the parameter

$$\gamma = \frac{1 + \lambda + \sigma/2}{\sigma + \lambda}.$$

Note that $\gamma \in (1/2, 1)$ as required, since $\lambda \ge 1$ and $2 < \sigma$. With such a
choice of 7 , we have b = a/p < 1/2, since a < p/2. Thus from fj25ll and fl2^ 


m—1 


dreN / p \ i + 2 <^/p / ^ ^ 




holds with probability larger than 

1 _ (sreiV ^ 


a/ 2-1 


2a/p 


a + A 
1 + A/2 


a + A / 2A - 1 
Finally, to estimate $Q_{k_m}(I)$ we note that

$$k_m \le \frac{1}{1-\gamma} = \frac{\sigma + \lambda}{\sigma/2 - 1},$$

and apply the "moreover" part of Proposition 5.3 (with $\ell = k_m$). Note that at
the beginning of the proof we assumed that $k > 1/(1-\gamma)$. In the case
$k \le 1/(1-\gamma)$ the result trivially holds by the "moreover" part of Proposition 5.3
applied with $\ell = k$.

Case 2: (j){x) = (1/2) exp(x“). In this case we choose 7 = 2/3, so 
that 7o = 1 /6. As before we assume that k > 1/(1 — 7) = 3 (otherwise 
Qk{I) < Q 2 {I))- By (IT^ we have < 3, hence, by (!?!]) 


m—1 


m—1 


Qk{I) < exp I 4 
We dehne ks by 


Q2{i) + Afc tg I. 


s=0 


s=0 


£5 = - exp - — 


a/2 


(s + 2 ) 


2a 
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Observe that since ks < 'y^k and 7 = 2/3, one has 


£. < 2 exp I - 


3 'N 1 


(s + 2 ) 


2a 


<^(s + 2 ) 


2a 


50/2 


which implies 


m—1 ^ 

n 

s=0 


( 26 ) 


for a positive absolute constant C. 
We have 


ts = Vq 


ks/3 + f5TeN \^ ks/3 + 1 


\/ ks 


V kgSg J 


\/ ks 




By flT^ we have k^ < 3 < /Cm-i, hence, 


+ In — 

2 e, 

20 reiV\^/“ 


< \/6 ^ a/^ (6 (1 + ^In 

< v/6 I 2'/" 77 (6(1+£))'/» ((in^) + (in^) 


1 /q;'' 


By the choice of Eg we obtain 


m—1 


1 


^ (in 


V 2e, 


1/q; 


m—1 


— + 2 ) ^ < sVk. 


(27) 


s=0 A ^ 

Since 3~^k < kg < (2/3)^/c, we observe 


m—1 


20 reiV\^/“ 


m—1 


5^77 < 7 ^ 5 : 

s=0 A s s / ^_Q 


2 \*/^ / 20reA^3"\^/' 


< 


1 / 2\ 


\s=0 ^ ^ ^ ^ s=0 


s=0 

20 reiV\^/" — 

+ 

s=0 


In 


k 


s/2 


(2sln3)^/" 


< i (|ln ^2^ j ^ + r(l + l/a) j , 
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where Ci is an absolute positive constant and T is the Gamma function. This 
together with fl27|) implies that 


m—1 


^ G < (G2(1 + Vk 


fr 20reArV^" 


s=0 




In 


k J 


+ r(i + i/o^) 


where C 2 is an absolute positive constant. 

Now we estimate the probability. By the choice of kg we have 


( 28 ) 


m—1 m—1 m—1 

= ^exp(-£fc,ln(l/e,)) = ^exp(-£A:, {ln2 + {k/kg)^/\s + 2)-^^)) 

s=0 s=0 s=0 


m—1 

< exp {-ekl-^/^ k^/^{s + 2 )"^“) . 

s=0 


Since kg > km-i > 1/(1 “ 7 ) and s + 2 < m + 1 for every s < m — 1, we 
get that 

e: 


m—1 


< jyi gxp 


s=0 


(1 — {jri + 1)^“^ 

Since m is chosen such that 1/(1 — 7 ) < km-i < (2/3)”^“^fc, we observe that 

ln(/c(l - 7 )) 


m — 1 < 


ln(3/2) 


Therefore, 


m—1 


y- ek^ < (1 + exp 

-l^+ln(3/2)r"P 


ka/2 


(l/3)i-«/2 (2.5 lnfc)2« 


< 2 exp —3e: 


ka/2 


'3“/2 (2.5 lnA:)2“ J ’ 
which shows that probability in fl22l) is at least 

^a/2 


(15reiV)^^ 


exp I — 3£ 


(3.5 Infc) 


2a 


Finally, to estimate Q 2 {I) we apply the “moreover” part of Proposition lS^ 
(with £ = 2). Choosing e = A/3 and combining estimates fl26|) . and fl28|l with 
the estimate for Q 2 {,I) we obtain the desired result. □ 
5.3 Estimating $A_k$ and $B_k$

We are now ready to pass to the proof of Theorem 2.1. To prove the theorem
we need two simple lemmas.

Lemma 5.5 Let $\beta \in (0, 1)$. Let $P_1$ and $P_2$ be probability measures on $\Omega_1$
and $\Omega_2$ respectively and let $V \subset \Omega_1 \times \Omega_2$ be such that

$$P_1 \otimes P_2 (V) \ge 1 - \beta.$$

Then there exists $W \subset \Omega_2$ such that

$$P_2(W) \ge 1 - \sqrt{\beta} \quad \text{and} \quad \forall x_2 \in W,\ \ P_1(\{x_1 : (x_1, x_2) \in V\}) \ge 1 - \sqrt{\beta}.$$

Proof. Fix some $\delta \in (0, 1)$. Let

$$W := \{x_2 \in \Omega_2 : P_1(\{x_1 \in \Omega_1 : (x_1, x_2) \in V\}) \ge 1 - \delta\}.$$

Clearly,

$$W^c = \{x_2 \in \Omega_2 : P_1(\{x_1 \in \Omega_1 : (x_1, x_2) \in V^c\}) > \delta\}.$$

Then

$$\beta \ge P_1 \otimes P_2 (V^c) = \int_{\Omega_2} P_1(\{x_1 \in \Omega_1 : (x_1, x_2) \in V^c\})\, dP_2(x_2) \ge \int_{W^c} P_1(\{x_1 \in \Omega_1 : (x_1, x_2) \in V^c\})\, dP_2(x_2) \ge \delta P_2(W^c),$$

which means $P_2(W) \ge 1 - \beta/\delta$. The choice $\delta = \sqrt{\beta}$ completes the proof. □
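The statement of Lemma 5.5 can be checked directly on a finite product space. The following sketch assumes uniform measures on $\{0,\dots,n_1-1\}$ and $\{0,\dots,n_2-1\}$ and a set `V` of pairs; the function name `good_second_coordinates` is hypothetical. Comparisons with $\sqrt{\beta}$ are done on squares of exact fractions to avoid rounding.

```python
from fractions import Fraction


def good_second_coordinates(V, n1, n2):
    """Return (beta, W) for uniform measures on range(n1) x range(n2).

    beta is the measure of the complement of V, and W collects the x2 whose
    section {x1 : (x1, x2) not in V} has measure at most sqrt(beta).
    """
    beta = 1 - Fraction(len(V), n1 * n2)
    W = []
    for x2 in range(n2):
        bad = Fraction(sum((x1, x2) not in V for x1 in range(n1)), n1)
        if bad * bad <= beta:  # equivalent to bad <= sqrt(beta) since bad >= 0
            W.append(x2)
    return beta, W
```

The lemma then predicts that the measure of the complement of `W` is at most $\sqrt{\beta}$, that is, `(1 - Fraction(len(W), n2)) ** 2 <= beta`.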
The following lemma is obvious.

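Lemma 5.6 Let $x_1, \dots, x_N \in \mathbb{R}^n$, then

$$\sum_{i \ne j} \langle x_i, x_j \rangle = 2^{-(N-2)} \sum_{I \subset \{1,\dots,N\}} \sum_{i \in I} \sum_{j \in I^c} \langle x_i, x_j \rangle.$$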
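The identity of Lemma 5.6 follows from counting: each ordered pair $(i, j)$ with $i \ne j$ is separated by exactly $2^{N-2}$ subsets $I$ (those containing $i$ but not $j$). A brute-force check over all subsets, written as a minimal sketch with hypothetical function names and exact integer arithmetic, looks as follows.

```python
from fractions import Fraction


def inner(x, y):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def off_diagonal_sum(xs):
    """Sum of <x_i, x_j> over all ordered pairs with i != j."""
    N = len(xs)
    return sum(inner(xs[i], xs[j]) for i in range(N) for j in range(N) if i != j)


def averaged_cut_sum(xs):
    """2^{-(N-2)} times the sum over all subsets I of sum_{i in I, j not in I} <x_i, x_j>."""
    N = len(xs)
    total = 0
    for mask in range(1 << N):  # each bitmask encodes one subset I
        inside = [i for i in range(N) if (mask >> i) & 1]
        outside = [j for j in range(N) if not (mask >> j) & 1]
        total += sum(inner(xs[i], xs[j]) for i in inside for j in outside)
    return Fraction(total, 2 ** (N - 2))
```

The enumeration is exponential in $N$, so this is only meant for small examples with $N \ge 2$.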
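Proof of Theorem 2.1. From Lemma 5.6 we have

$$\Big| \sum_{i=1}^{N} a_i X_i \Big|^2 - \sum_{i=1}^{N} a_i^2 |X_i|^2 = 2^{-(N-2)} \sum_{I \subset \{1,2,\dots,N\}} \Big\langle \sum_{i \in I} a_i X_i,\ \sum_{j \in I^c} a_j X_j \Big\rangle.$$

We deduce that

$$B_k^2 \le 2^{-(N-2)} \sup_{a \in U_k} \sum_{I} Q(a, I, I^c) \le 2^{-(N-2)} \sum_{I} \sup_{a \in U_k} Q(a, I, I^c) \le 2^{-(N-2)} \sum_{I} Q_k(I).$$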

Let $I \subset \{1, \dots, N\}$ be fixed. Proposition 5.4 implies

$$P\big( Q_k(I) \le M_0 \big) \ge 1 - \beta, \tag{29}$$

where 

Mo := C^t max|W| + (d^i/4) Ak- 

i<N 

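Consider two probability spaces: $\{I : I \subset \{1, \dots, N\}\}$ with the normalized
counting measure $\mu$ and our initial probability space $(\Omega, P)$, on which the $X_i$'s
are defined. By (29) we observe that the $\mu \otimes P$ probability of the event
$V := \{Q_k(I) \le M_0\}$ is at least $1 - \beta$. Then Lemma 5.5 implies that there
exists $W \subset \Omega$ such that $P(W) \ge 1 - \sqrt{\beta}$ and such that for every $\omega \in W$ one
has $\mu(\{I : Q_k(I) \le M_0\}) \ge 1 - \sqrt{\beta}$. Since $Q_k(I) \le A_k^2$, we obtain that for every
$\omega \in W$,

$$B_k^2 \le 4 M_0 + 4 \sqrt{\beta}\, A_k^2.$$

Since $A_k^2 \le \max_{i \le N} |X_i|^2 + B_k^2$, we have

$$A_k^2 \le \frac{4 M_0 + \max_{i \le N} |X_i|^2}{1 - 4\sqrt{\beta}} \quad \text{and} \quad B_k^2 \le \frac{4 \big( M_0 + \sqrt{\beta} \max_{i \le N} |X_i|^2 \big)}{1 - 4\sqrt{\beta}}. \tag{30}$$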

Therefore 

max |Xjp + max |W| + MiA]^ . 

\ i<N i<N J 

Using \/m 2 -|- -yS -c u + and denoting 7 = (1 — (recall M = 

maxj< 7 v |dfj|) we obtain 

Ak ^ \/ 7 AI + 2'y/C 0'7UM + 7Ml, 
which proves the estimate for $A_k$. Plugging this into (30) we also observe

< 7 (4 m 2 + AC^tM + -iMl + y^MMi + 

< 7 [a^/^ m 2 + SC^tM + 2-fMl + ^MMl) . 

This completes the proof. □ 


6 Optimality 


In this section we discuss optimality of estimates in Theorems 2.1 and 3.1.
In Propositions 6.5, 6.6 and 6.7 we will prove results justifying remarks on
optimality following these theorems.

To obtain the lower estimates on $A_m$ we use the following observation.


Lemma 6.1 Let $A = (X_{ij})_{i \le n,\, j \le N}$ be an $n \times N$ matrix with i.i.d. entries.
Then

$$P(A_m \ge t) \ge \frac{1}{2}$$

whenever

$$P\big( |X_{11}| \ge t/\sqrt{m} \big) \ge \frac{m+1}{N}. \tag{31}$$


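Proof. For every $j \le N$, let $X_j \in \mathbb{R}^n$ be the $j$-th column of $A$. For $m \le N$
we have

$$A_m = \sup_{a \in U_m} \Big| \sum_{j=1}^{N} a_j X_j \Big| \ge \sup_{a \in U_m} \Big| \sum_{j=1}^{N} a_j X_{1j} \Big| \ge \sup_{\substack{a \in U_m \\ a_j \in \{\pm 1/\sqrt{m},\, 0\}}} \Big| \sum_{j=1}^{N} a_j X_{1j} \Big| = \frac{1}{\sqrt{m}} \sum_{j=1}^{m} X_{1j}^* \ge \sqrt{m}\, X_{1m}^*,$$

where $(X_{1j}^*)_j$ denotes a non-increasing rearrangement of $(|X_{1j}|)_j$.
Therefore, using independence, we have

$$P(A_m \ge t) \ge P\big( X_{1m}^* \ge t/\sqrt{m} \big) = P(\Gamma \ge m),$$

where $\Gamma$ is a real random variable with a binomial distribution of size $N$
and parameter $v = P(|X_{11}| \ge t/\sqrt{m})$. It is well known that the median of $\Gamma$,
$\mathrm{med}(\Gamma)$, satisfies

$$\lfloor N v \rfloor \le \mathrm{med}(\Gamma) \le \lceil N v \rceil.$$

Thus $P(A_m \ge t) \ge \frac{1}{2}$ whenever $m \le \lfloor N v \rfloor$. This implies the result. □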
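The median fact used at the end of this proof can be verified exactly for small parameters. The sketch below computes the upper tail of a binomial distribution with rational success probability; the function name `binomial_upper_tail` is hypothetical, and exact fractions are used so that the comparison with $1/2$ involves no rounding.

```python
from fractions import Fraction
from math import comb


def binomial_upper_tail(N, v, m):
    """Exact P(Gamma >= m) for Gamma ~ Binomial(N, v), with v a Fraction in [0, 1]."""
    v = Fraction(v)
    return sum(comb(N, j) * v ** j * (1 - v) ** (N - j) for j in range(m, N + 1))
```

For example, with $N = 20$ and $v = 3/10$ one has $\lfloor N v \rfloor = 6$, and `binomial_upper_tail(20, Fraction(3, 10), 6)` is at least $1/2$, in line with $\lfloor N v \rfloor \le \mathrm{med}(\Gamma)$.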

To evaluate RIP, we will use the following simple observation. 

Lemma 6.2 Let $n \le N$ and $m \le N$. Let $A$ be an $n \times N$ random matrix
satisfying

$$P\big( A_m \ge t\sqrt{m} \big) \ge \frac{1}{2}.$$

Assume also that $A$ satisfies $RIP_m(\delta)$ for some $\delta < 1$ with probability greater
than 1/2. Then

$$m t^2 \le 2n.$$

Proof. As $A$ satisfies $RIP_m(\delta)$ for some $\delta < 1$ with probability greater than
1/2, then clearly

$$A_m^2 = \sup_{a \in U_m} |Aa|^2 \le 2n$$

with probability greater than 1/2. Therefore, with positive probability one
has

$$t\sqrt{m} \le A_m \le \sqrt{2n},$$

which implies the result. □


In order to show that a matrix with i.i.d. random variables satisfies
condition $H(\phi)$ with $\phi(t) = t^p$ we need Rosenthal's inequality ([29], see
also HZ]). As usual, by $\|\cdot\|_q$ for a random variable $\xi$ we mean its $L_q$-norm
and for an $a \in \mathbb{R}^n$ its $\ell_q$-norm, that is

$$\|\xi\|_q = \big( E|\xi|^q \big)^{1/q}, \qquad \|a\|_q = \Big( \sum_{i=1}^{n} |a_i|^q \Big)^{1/q}.$$

Note that originally the Rosenthal inequality was proved for symmetric
random variables, but using a standard symmetrization argument (i.e., passing
from random variables $\xi_i$'s to $(\xi_i - \xi_i')$'s, where the $\xi_i'$'s have the same
distribution and are independent), one can pass to centered random variables.



Lemma 6.3 Let $q > 2$ and $a \in \mathbb{R}^n$. Let $\xi_1, \dots, \xi_n$ be i.i.d. centered
random variables with finite $q$-th moment. Then there exists a positive absolute
constant $C$ such that



< 


n 

'y ^ 

i=l 


<C^M, 

In q 


(32) 
where $M_q := \max\{ \|a\|_2 \|\xi_1\|_2,\ \|a\|_q \|\xi_1\|_q \}$.

The following is an almost immediate corollary of Rosenthal’s inequality. 
It should be compared with Proposition 1.3 of |3T]. 


Corollary 6.4 Let $p > 4$. Let $\xi$ be a random variable of variance one and
with a finite $p$-th moment. Let $\xi_{ij}$, $i \le n$, $j \le N$ be i.i.d. random variables
distributed as $\xi$. Then for every $t > 0$,


P max 
yj<N 




N 


where C is a positive absolute constant. 


Proof. Let ^i, ..., fn be i.i.d. random variables distributed as f. We apply 
Rosenthal’s inequality to random variables {ff — 1) with q = p/2 and a = 
(1,Then 


n 


Ek? -1) 


p/2 




l||p/2 < Cps/n (||.C^||p/2 + l) < ‘^Cpy/n\\f\\p, 


where Cp = Cp/ Inp for an absolute positive constant C. Using Chebyshev’s 
inequality we observe 


P 



lESr=ilg-lK^ ^ 

(tn)P/‘^ ~ flP/4: 


The result follows by the union bound. 


□ 


As is mentioned in remarks on optimality following Theorem 2.1, the next
proposition gives a lower bound for $A_m$ to be compared with Case 1 of
Theorem 2.1.


Proposition 6.5 Let p>2, l<m<N. There exists a sequence of inde¬ 
pendent random vectors Xi, - ■ ■ ,Xi\i in ML satisfying 

VI < i < iV Va e E| {Xi, a) 1^ < 1 (33) 


and such that 


P 



> 


Cp 

hip 


m 






where C is an absolute positive constant. 
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Proof. Let A > 1 to be set later and let us put 


fp{x) = { 2(l-A-)lb-+^ 


if 1 < |a:| < A 
otherwise. 


We have J fp{x) dx = 1 and 


In A 


Op ;= J \xffpix) dx = 

Consider the random variable ^(cn) = u with respect to the density fp and 
let {Xij) be i.i.d. copies of Clearly, E|Xii|^ = 1. Since, for s G [1, A] 


p(ki>i>) = 


1 - A-P \ sf \r 


a short computation using shows that P(Am > t) > A provided that 


t < 


I-x-p\ i/p 


p In A 


N 


= vm 


p In A 


i/p 


(m + 1)(1 - X-P) + NX-P 

N A 

m + 1 + N/{XP - 1) 


i/p 


Choosing A from A^ — 1 = N/{m + 1), we obtain > t) > | provided 

that 


t < Wm 


N 


i/p 


2{m + 1) ln(2iV/(m + 1)) 

Finally, to satisfy condition (13^ . we pass from matrix A io A' = A/cp = 
{Xij/cp)ij, where Cp < Cp/lnp is a constant in Rosenthal’s inequality ([32]). 
By Rosenthal’s inequality, the sequence of columns of A' satishes the condi¬ 
tion (|33|). □ 


The next proposition gives an upper bound on the size of sparsity $m$ in
order to satisfy RIP under the condition of Case 1 of Theorem 3.1 (see Remark 3
following this theorem).
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Proposition 6.6 Let q>p>2, n<N and m < N. There exist an 
absolute positive constant C, an n x N matrix A, whose columns Xi, ...,X]y 
are independent random vectors satisfying 


VI < i < iV Va e 


E\{X„a) |P< 


Cp\^ q 
Inp J q — p 



p/2 


. (34) 


and for every t G (0,1), 


( max 


\ j<W 

n 



< 


(35) 


provided that 


N<i /'"V 

\C{q-2)pJ q 


Assume that A satisfies RIPm{d) for some 5 < 1 with probability greater than 
1/2. Then 

N 2(g-2) 

m 1 - 1 <- n. 

m + 1 


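Proof. Consider the density

$$f(x) = \begin{cases} \dfrac{q}{2|x|^{q+1}} & \text{if } |x| \ge 1, \\ 0 & \text{otherwise.} \end{cases}$$

We have $\int f(x)\, dx = 1$,

$$\int |x|^p f(x)\, dx = \frac{q}{q-p} \quad \text{and} \quad a_2^2 := \int |x|^2 f(x)\, dx = \frac{q}{q-2}.$$

Consider the random variable $\xi(\omega) = \omega$ with respect to the density $f$ and let
$(X_{ij})_{ij}$ be i.i.d. copies of $\xi/a_2$. Clearly,

$$E|X_{11}|^2 = 1 \quad \text{and} \quad E|X_{11}|^p = \frac{q}{q-p} \Big( \frac{q-2}{q} \Big)^{p/2}.$$

Then Rosenthal's inequality (32) implies the condition (34) and Corollary 6.4
implies (35).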
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Now we estimate Am for the matrix A, whose columns are j < N. 

Since, for s > 1, P (|^| > s) = s~‘’, by (1^ . we obtain that P(Am > | 

provided that 



/ N y/" 

\m + 1/ 


This means 



N 

m + 1 



and we complete the proof applying Lemma 16.21 


□ 


The next proposition shows the optimality (up to absolute constants) of
the sparsity parameter in Case 2 of Theorem 3.1 (see Remark 4 following this
theorem) as well as optimality of bounds for $A_m$ in Case 2 of Theorem 2.1
(see remarks on optimality following this theorem).


Proposition 6.7 There exist absolute positive constants c, C such that the 
following holds. Let a G [1, 2], 1 < m < N/2 and n satisfies N < exp(cn"/^). 
There exists an n x N matrix A, whose columns Xi, are independent 

random vectors satisfying 


VI < i Va e Eexp (| {X„ a) ]“) < C (36) 


and 


P 


max 

i<N 





< 2exp(-cn"/^), 


and such that 



(37) 


(38) 


Additionally, if n < N and if A satisfies RIPm{d) for some 5 < 1 with 
probability greater than 1/2, then 


m 



N 

m + 1 


2/a 

< 4n. 


Proof. We consider a symmetric random variable $\xi$ with the distribution
defined by $P(|\xi| > t) = \exp(-t^{\alpha})$. It is easy to check that

$$E \exp\big( |\xi|^{\alpha}/2 \big) = 2$$

and

$$\sigma^2 := E\xi^2 = \Gamma\Big( 1 + \frac{2}{\alpha} \Big) \in [1, 2].$$
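Both moment computations can be confirmed numerically. The sketch below rests on the observation that $Y = |\xi|^{\alpha}$ has the standard exponential distribution, so $E\exp(Y/2) = \int_0^{\infty} e^{y/2} e^{-y}\, dy$ and $E\xi^2 = E\,Y^{2/\alpha} = \Gamma(1 + 2/\alpha)$. The function names, the truncation point and the step count are choices made here for illustration only.

```python
import math


def exp_half_moment(upper=80.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(y/2) * exp(-y) over [0, upper]."""
    h = upper / steps
    return h * sum(math.exp(-0.5 * (i + 0.5) * h) for i in range(steps))


def second_moment(alpha):
    """E xi^2 = Gamma(1 + 2/alpha) for P(|xi| > t) = exp(-t^alpha)."""
    return math.gamma(1.0 + 2.0 / alpha)
```

The first function returns a value very close to 2, and `second_moment(alpha)` decreases from 2 at `alpha = 1` to 1 at `alpha = 2`, which is why the normalization by $\sigma$ only changes constants.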


Let Xij, i < n, j < N he i.i.d. copies of ^/y/a, A = and X^’s be its 

colnmns. Applying Lemma 3.4 from |1] (see also Theorem 1.2.8 in m) we 
observe that Xj’s satisfy conditions (|36|1 and (l37|l . By (|3Tll we observe that 
> I provided that 



Thns it is enongh to take 



This proves the estimate fl38ll . 


Finally, the “additionally” part follows by Lemma [6.21 


□ 
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